Extracting paths and URLs from a config file with regular expressions

Date: 2011-07-24 11:34:47

Tags: ruby regex

I have a gitmodules file like this:

[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git

What I want is to get the path and url of each submodule as a hash, or some other structure that holds the data:

submodule: cucumber (path -> 'path', url -> 'url')

How can I do this with a regular expression? Or is there a more efficient way to parse this kind of file?

2 answers:

Answer 0: (score: 1)

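This file format is standard, so I imagine there is a gem or some other code that can already parse it. On the other hand, it is easy to parse, and small, self-contained text problems like this are the "fun part" of development, so why not reinvent the wheel? It's a bit like playing a game...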

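require 'pp'

# Parse a git-style config file into a hash of sections,
# where each section is itself a hash of its key/value settings.
def scangc(path = '../.gitconfig')
  # Keys that appear before any [section] header land in the top-level hash.
  result = h = {}
  File.open(path, 'r') do |f|
    while s = f.gets
      s.strip!
      next if s.empty? || s.start_with?('#', ';') # skip blank lines and comments
      if s.start_with?('[')
        # "[section]" header: start a new hash for the keys that follow
        result[s[1..-2].to_sym] = h = {}
        next
      end
      raise 'expected =' unless s['=']
      # Split on the first "=" only, so values containing "=" stay intact
      a = s.split(/\s*=\s*/, 2)
      h[a[0].to_sym] = a[1]
    end
  end
  pp result
end

scangc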
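
For the gitmodules file in the question specifically, the same line-by-line idea might look something like the sketch below. It is only an illustration: `parse_gitmodules` is a made-up helper name, it works on a string rather than a file so it can be run as-is, and it assumes every setting sits on a single `key = value` line under a `[submodule "..."]` header.

```ruby
# Hypothetical helper (not part of the original answer): turns gitmodules-style
# text into { "submodule name" => { "path" => ..., "url" => ... } }.
def parse_gitmodules(text)
  submodules = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#', ';') # blank lines and comments
    if (m = line.match(/\A\[submodule\s+"(.+)"\]\z/))
      # Section header: start a fresh hash for this submodule
      current = submodules[m[1]] = {}
    elsif current && (m = line.match(/\A([\w.-]+)\s*=\s*(.*)\z/))
      # key = value line inside the current submodule section
      current[m[1]] = m[2]
    end
  end
  submodules
end

sample = <<~GITMODULES
  [submodule "dotfiles/vim/bundle/cucumber"]
  path = dotfiles/vim/bundle/cucumber
  url = git://github.com/tpope/vim-cucumber.git
  [submodule "dotfiles/vim/bundle/Command-T"]
  path = dotfiles/vim/bundle/Command-T
  url = git://github.com/vim-scripts/Command-T.git
GITMODULES

modules = parse_gitmodules(sample)
puts modules["dotfiles/vim/bundle/cucumber"]["url"]
```

To run it against a real file, something like `parse_gitmodules(File.read('.gitmodules'))` should work, with the file location being an assumption about your checkout.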

Answer 1: (score: 0)

I would do it like this in Python:

import re
x = """[submodule "dotfiles/vim/bundle/cucumber"]
path = dotfiles/vim/bundle/cucumber
url = git://github.com/tpope/vim-cucumber.git
[submodule "dotfiles/vim/bundle/Command-T"]
path = dotfiles/vim/bundle/Command-T
url = git://github.com/vim-scripts/Command-T.git"""

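# Raw strings keep the regex backslashes (\[, \s) from being read as string escapes.
submodules = re.findall(r'\[submodule.*/(.*)"\]', x)
paths = re.findall(r'path\s*=\s*(.*)', x)
urls = re.findall(r'url\s*=\s*(.*)', x)
# Pair the three lists up by position; this assumes every submodule has both a path and a url.
submodule_dict = {name: {'path': path, 'url': url}
                  for name, path, url in zip(submodules, paths, urls)}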

This creates submodule_dict as

{'Command-T': {'path': 'dotfiles/vim/bundle/Command-T',
               'url': 'git://github.com/vim-scripts/Command-T.git'},
 'cucumber': {'path': 'dotfiles/vim/bundle/cucumber',
              'url': 'git://github.com/tpope/vim-cucumber.git'}}
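
Since the question is tagged ruby, here is a rough Ruby counterpart of the same regex idea, offered as a sketch rather than a definitive solution. Instead of three separate searches zipped together, it uses one pattern per submodule block, so a path and url cannot be paired with the wrong header. The names `ENTRY` and `submodules_by_name` are made up for the example, and the pattern assumes `path` always comes before `url` and that neither value contains whitespace.

```ruby
# One match per submodule block: header, then path, then url (assumed order).
ENTRY = /\[submodule\s+"([^"]+)"\]\s*path\s*=\s*(\S+)\s*url\s*=\s*(\S+)/

# Hypothetical helper: keys the result by the last component of the section
# name, like the Python version above does.
def submodules_by_name(text)
  text.scan(ENTRY).each_with_object({}) do |(section, path, url), result|
    result[File.basename(section)] = { path: path, url: url }
  end
end

sample = <<~GITMODULES
  [submodule "dotfiles/vim/bundle/cucumber"]
  path = dotfiles/vim/bundle/cucumber
  url = git://github.com/tpope/vim-cucumber.git
  [submodule "dotfiles/vim/bundle/Command-T"]
  path = dotfiles/vim/bundle/Command-T
  url = git://github.com/vim-scripts/Command-T.git
GITMODULES

by_name = submodules_by_name(sample)
puts by_name["Command-T"][:url]
```

Keying by the last path component is convenient but would collide if two submodules shared a directory name; using the full section name as the key avoids that.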