Ruby CSV.parse is very picky when it encounters quotes

Time: 2012-04-01 15:55:22

Tags: ruby csv fastercsv

I'm finding CSV parsing in Ruby 1.9.3 to be very fragile. So much so that I wonder whether I'm doing something wrong.

If I do the following in irb, I get an error:

1.9.3-p125 :011 > require 'csv'
 => true
1.9.3-p125 :012 > a = 'one,two,three, "four, five",six'
 => "one,two,three, \"four, five\",six" 
1.9.3-p125 :013 > arr = CSV.parse(a)
CSV::MalformedCSVError: Illegal quoting in line 1.
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `each'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `loop'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `shift'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1791:in `each'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `to_a'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `read'
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1379:in `parse'
    from (irb):13
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'

I've found that the problem is the extra space before the "four, five" value. If I remove that space, it works:

1.9.3-p125 :010 > a = 'one,two,three,"four, five",six'
 => "one,two,three,\"four, five\",six" 
1.9.3-p125 :011 > arr = CSV.parse(a)
 => [["one", "two", "three", "four, five", "six"]]

Spaces before the other values don't cause a problem. The following parses fine:

one, two, three,"four, five", six

Is there some parsing option I've missed that is making quoted values so fragile?

1 answer:

Answer 0: (score: 3)

This is correct behavior. It's not fragile.

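The comma before "four" ends the previous field, and the next field begins immediately with the space. Because that field starts with a space rather than a quote, it is treated as an unquoted field, so the quote that follows lands in the middle of it.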

You can't validly put a quote in the middle of a field (without escaping it).
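
If the input can't be fixed at the source, one possible workaround is to strip the whitespace between a delimiter and an opening quote before handing the line to the parser. The following is only a rough sketch: `parse_lenient_line` is a hypothetical helper name, not part of the CSV library, and the regex is naive (it would also rewrite a `, "` sequence that appears inside a quoted field), so it only suits data where that can't happen.

```ruby
require 'csv'

# Hypothetical helper, not part of the CSV library.
# Removes spaces/tabs between a comma and an opening quote so that the
# quote becomes the first character of the field, as the parser expects.
# Naive assumption: a `, "` sequence never occurs inside a quoted field.
def parse_lenient_line(line)
  cleaned = line.gsub(/,[ \t]+"/, ',"')
  CSV.parse_line(cleaned)
end

line = 'one,two,three, "four, five",six'

# The strict parse of the original line is the case reported in the question.
strict_failed =
  begin
    CSV.parse(line)
    false
  rescue CSV::MalformedCSVError
    true
  end

fields = parse_lenient_line(line)
# fields is ["one", "two", "three", "four, five", "six"]
```

Unquoted fields are left untouched, so any leading spaces on them are kept, just as in the `one, two, three,"four, five", six` example from the question.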