Question

I have a .gitmodules file like this:

[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git

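What I want is to get the path and url of each submodule as a hash or some other structure that holds the data:

submodule: cucumber (path -> 'path', url -> 'url')

How can I do this with a regular expression? Or is there perhaps a more efficient way to parse this kind of file?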

Answer 1

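This file format is standard, so I imagine there is a gem or some other code that can already parse it. On the other hand, it is easy to parse, and wrapping up small text problems like this is the "fun part" of development, so why not reinvent the wheel? It's a bit like playing a game...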

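require 'pp'

# Parse a git-style config file into { section => { key => value } }.
def scangc(path = '../.gitconfig')
  # Keys that appear before the first section header land at the top level.
  result = h = {}
  File.foreach(path) do |line|
    s = line.strip
    next if s.empty? || s.start_with?('#', ';')  # skip blank lines and comments
    if s.start_with?('[')
      # Section header, e.g. [submodule "dotfiles/vim/bundle/cucumber"]
      result[s[1..-2].to_sym] = h = {}
      next
    end
    raise 'expected =' unless s.include?('=')
    # Split on the first '=' only, so values may themselves contain '='.
    key, value = s.split(/\s*=\s*/, 2)
    h[key.to_sym] = value
  end
  pp result
end

scangc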
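The answer guesses that existing code can already parse this format. As a hedged illustration in Python (not the gem the answer has in mind), the standard-library configparser module happens to read the sample from the question. The helper name parse_gitmodules is made up for this sketch, and configparser's INI dialect is not identical to git's config syntax, so treat this as something that works for simple files like the sample rather than as a general git config parser.

```python
import configparser

# Sample text copied from the question.
GITMODULES = """\
[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git
"""


def parse_gitmodules(text):
    """Hypothetical helper: map each section header to its key/value pairs."""
    # interpolation=None keeps '%' characters in values from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


modules = parse_gitmodules(GITMODULES)
for header, settings in modules.items():
    print(header, settings)
```

Note that the keys of the result are the full section headers, such as submodule "dotfiles/vim/bundle/cucumber", so a caller who wants just the short name would still need to strip the prefix and quotes.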

Answer 2

I would do it like this in Python:

import re
x = """[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git"""

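# Short name of each submodule: the last path component in the section header.
submodules = re.findall(r'\[submodule.*/(.*)"\]', x)
# Values of every path and url key, in file order.
paths = re.findall(r"path\s*=\s*(.*)", x)
urls = re.findall(r"url\s*=\s*(.*)", x)
# Pair them up by position; this assumes every section has exactly one path and one url.
submodule_dict = {
    name: {'path': path, 'url': url}
    for name, path, url in zip(submodules, paths, urls)
}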

which creates submodule_dict as

{'Command-T': {'path': 'dotfiles/vim/bundle/Command-T',
               'url': 'git://github.com/vim-scripts/Command-T.git'},
 'cucumber': {'path': 'dotfiles/vim/bundle/cucumber',
              'url': 'git://github.com/tpope/vim-cucumber.git'}}

Extracting the path and URL from a config file with a regular expression
