How to remove leading and trailing spaces around commas, and single quotes around strings

Time: 2016-04-12 07:47:24

Tags: ruby csv

I have an input CSV file in the following format:
SN,Date,TIME,Line,MatchedLine
C0001 , Mar 22 , 02:50:25, '"heartbeat"', '"kernel: heartbeat event \n"'
C0002 , Mar 22 , 02:50:25, '"Wait max"', '"kernel:  Wait max 12 seconds and start polling eCM \n"'

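I am trying to convert the entire CSV file into an array of hashes, so that each header maps to the value in the corresponding column, as shown below.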

 [{"SN"=>"C0001", "Date"=>"Mar 22", "TIME"=>"02:50:25", "Line"=>"heartbeat", "MatchedLine"=>"kernel: heartbeat event \\n"},
  {"SN"=>"C0002", "Date"=>"Mar 22", "TIME"=>"02:50:25", "Line"=>"Wait max", "MatchedLine"=>"kernel:  Wait max 12 seconds and start polling eCM \\n"}]

I have a function that does this:

  def csv_to_hash file
    words_list_hash  = []
    csv_file = File.open(file.strip,"r:bom|utf-8")
    csv_data = CSV.read csv_file
    headers = csv_data.shift.map {|i| i.to_s }
    string_data = csv_data.map {|row| row.map {|cell| cell.to_s } } 
    words_list_hash = string_data.map {|row| Hash[*headers.zip(row).flatten] }
    return words_list_hash
  end

But it raises an error while reading the CSV file (at CSV.read csv_file):

 /home/tivo/.rvm/rubies/ruby-1.9.3-p286/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift': Illegal quoting in line 3. (CSV::MalformedCSVError)

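I found that this is caused by the malformed input. If I remove the leading and trailing spaces around the commas and the single quotes around the strings, the code works: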

 C0001,Mar 22,02:50:25,"heartbeat","kernel: heartbeat event \n"

Can anyone tell me how to convert the entire CSV file into the format shown above?

1 Answer:

Answer 0: (score: 4)

# read as string
# (the open mode goes in the :mode option; a bare string here is taken as the read length)
csv = File.read file.strip, mode: "r:bom|utf-8"
# parse and separate header and the rest
head, *rest = csv.split($/).map { |row| row.split(/["']*\s*,\s*['"]*/) }
# produce the hash requested
rest.map { |row| head.zip(row).to_h }
#⇒ [
#  [0] {
#           "Date" => "Mar 22",
#           "Line" => "heartbeat",
#    "MatchedLine" => "kernel: heartbeat event \\n\"'",
#             "SN" => "C0001",
#           "TIME" => "02:50:25"
#  },
#  [1] {
#           "Date" => "Mar 22",
#           "Line" => "Wait max",
#    "MatchedLine" => "kernel:  Wait max 12 seconds and start polling eCM \\n\"'",
#             "SN" => "C0002",
#           "TIME" => "02:50:25"
#  }
# ]
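
Note that in the output above the last field still ends in "' because the split pattern only consumes quotes that sit next to a comma. A minimal sketch of an alternative, assuming the stray single quotes always wrap a double-quoted field: rewrite each line into well-formed CSV first, then let the CSV library do the parsing. The names normalize_csv_line, csv_text_to_hashes, and SAMPLE are hypothetical and exist only for this illustration.

```ruby
require "csv"

# Hypothetical helper: turn one raw line into well-formed CSV text.
def normalize_csv_line(line)
  line.strip
      .gsub(/'"|"'/, '"')    # '"heartbeat"' becomes "heartbeat"
      .gsub(/\s*,\s*/, ",")  # drop whitespace around every comma
end

# Hypothetical helper: clean the whole text, then let CSV build the hashes.
def csv_text_to_hashes(text)
  cleaned = text.each_line.map { |line| normalize_csv_line(line) }.join("\n")
  CSV.parse(cleaned, headers: true).map(&:to_hash)
end

# Sample rows copied from the question; the single-quoted heredoc keeps
# the backslash-n sequences as two literal characters.
SAMPLE = <<'CSV'
SN,Date,TIME,Line,MatchedLine
C0001 , Mar 22 , 02:50:25, '"heartbeat"', '"kernel: heartbeat event \n"'
C0002 , Mar 22 , 02:50:25, '"Wait max"', '"kernel:  Wait max 12 seconds and start polling eCM \n"'
CSV

rows = csv_text_to_hashes(SAMPLE)
rows.each { |row| p row }
```

For a real file, the text could come from File.read(file.strip, mode: "r:bom|utf-8"). Because the CSV library handles the quoting, a comma inside a double-quoted field no longer splits that field. One caveat: the whitespace pattern is applied to the whole line, so spaces around a comma inside a quoted field are removed as well. CSV::Row#to_hash is used here instead of Array#to_h, which requires Ruby 2.1 or later, while the stack trace in the question shows Ruby 1.9.3.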