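Parsing a custom log file into an array of hashes

Time: 2011-01-12 14:22:50

Tags: ruby parsing

I want to parse a log file in which every record consists of 3 entries. It looks like this: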

Start: foo
Parameters: foo
End: foo

Start: other foo
Parameters: other foo
End: other foo

....

The foo values are what I want. It would be nice if the result looked like this:

logs = [
{
  :start=>"foo",
  :parameters=>"foo",
  :end=>"foo"
},
{
  :start=>"other foo",
  :parameters=>"other foo",
  :end=>"other foo"
}
]

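I know some regular expressions, but I'm having trouble understanding how to apply them across multiple lines. Thanks!

3 answers:

Answer 0 (score: 5)

The best way to do this is with a regular expression that spans multiple lines: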

# `file` holds the whole log as a String, e.g. file = File.read("log.log")
logs = file.scan(/^Start: (.*)\nParameters: (.*)\nEnd: (.*)$/)
#  => [["foo", "foo", "foo"], ["other foo", "other foo", "other foo"]]
logs.map! { |s, p, e| { :start => s, :parameters => p, :end => e } }
#  => [ {:start => "foo", :parameters => "foo", :end => "foo" }, ... ]
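
For a self-contained run of the same idea, here is a minimal sketch. The names `log_text` and `pattern` and the inline sample data are assumptions made for illustration; in practice the string would come from the log file itself. In Ruby, `^` and `$` match at line boundaries by default and `.` does not match a newline, so no extra regex flag is needed.

```ruby
# Sample data standing in for the log file contents (assumed for illustration).
log_text = <<EOS
Start: foo
Parameters: foo
End: foo

Start: other foo
Parameters: other foo
End: other foo
EOS

# One match per record; each match yields a [start, parameters, end] triple.
pattern = /^Start: (.*)\nParameters: (.*)\nEnd: (.*)$/

logs = log_text.scan(pattern).map do |start, params, finish|
  { :start => start, :parameters => params, :end => finish }
end

p logs
```

Because `scan` returns an array of capture-group arrays when the pattern has groups, the block can destructure each record directly into three variables.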

Answer 1 (score: 4)

#!/usr/bin/ruby1.8

require 'pp'

logfile = <<EOS
Start: foo
Parameters: foo
End: foo

Start: other foo
Parameters: other foo
End: other foo
EOS

logs = logfile.split(/\n\n/).map do |section|
  Hash[section.lines.map do |line|
    # Split at the first ": " only, so a value may itself contain ": ".
    key, value = line.chomp.split(/: /, 2)
    [key.downcase.to_sym, value]
  end]
end

pp logs
# => [{:end=>"foo", :parameters=>"foo", :start=>"foo"},
# =>  {:end=>"other foo", :parameters=>"other foo", :start=>"other foo"}]

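Answer 2 (score: 3)

Reading the entire log file into memory, as Wayne does, could be a problem with large files. This version processes it one row at a time: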

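require 'rubygems'
require 'fastercsv'  # Ruby 1.8 gem; on Ruby 1.9+ the standard library CSV offers the same foreach

log = []
h = {}
FasterCSV.foreach("log.log", :col_sep => ":") do |row|
  name, value = *row
  if !name.nil?
    # Each value arrives with the space that follows the colon, so strip it.
    h[name.downcase.to_sym] = value.to_s.strip
    if name == "End"
      log << h
      h = {}
    end
  end
end

log
# => [{:end=>"foo", :start=>"foo", :parameters=>"foo"},
#     {:end=>"other foo", :start=>"other foo", :parameters=>"other foo"}]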
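
The same row-at-a-time idea can also be sketched without a CSV library, splitting each line at the first `": "` only. This is a sketch under stated assumptions, not a drop-in replacement: the helper name `parse_log` is hypothetical, every record is assumed to finish with an `End:` line, and a `StringIO` stands in for the real file so the example is self-contained.

```ruby
require 'stringio'

# Hypothetical helper: reads any IO-like object one line at a time and
# builds one hash per record. A record is assumed to end at its "End:" line.
def parse_log(io)
  logs = []
  current = {}
  io.each_line do |line|
    # Split at the first ": " only, so values may contain colons.
    key, value = line.chomp.split(': ', 2)
    next if value.nil? # skip blank or malformed lines
    current[key.downcase.to_sym] = value
    if key == 'End'
      logs << current
      current = {}
    end
  end
  logs
end

# Sample data standing in for the log file (assumed for illustration).
sample = StringIO.new(<<EOS)
Start: foo
Parameters: foo
End: foo

Start: other foo
Parameters: other foo
End: other foo
EOS

p parse_log(sample)
```

With a real file, something like `File.open("log.log") { |f| parse_log(f) }` should behave the same way, since both objects respond to `each_line` and only one line is held in memory at a time.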