I want to read a file using Ruby's CSV class.
The file to be read looks like this:
CM_ SG_ 1325 XXX_Address "XXX address";
CM_ SG_ 612 YYY_MsgCounter "incremented by 1 each time a
message has been transmitted";
My Ruby code:
#!/usr/bin/env ruby
require 'pp'
require 'csv'
CSV.foreach(ARGV[0],:col_sep=>" ") do |row|
pp row
end
This is the error I get:
C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1898:in `block in shift': Unclosed quoted field on line 1. (CSV::MalformedCSVError)
from C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1805:in `loop'
from C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1805:in `shift'
from C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1747:in `each'
from C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1131:in `block in foreach'
from C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1282:in `open'
from C:/ruby-2.3.3-x64-mingw32/lib/ruby/2.3.0/csv.rb:1130:in `foreach'
from test.rb:4:in `<main>'
If I remove the semicolons at the ends of the lines, I get:
["CM_", "SG_", "1325", "XXX_Address", "XXX address"]
["CM_",
"SG_",
"612",
"YYY_MsgCounter",
"incremented by 1 each time a \r\nmessage has been transmitted"]
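This is what I expect to see.
I assume the problem is that CSV does not like a semicolon directly after the closing quote. Is there a way to strip that semicolon using CSV options, or to feed CSV a stream from which I have already removed it?
Clarification:
Sorry that I did not state this explicitly, but not every line will have a semicolon.
Also, I would like to thank the Tin Man for the superfluous edits to my post to boost his score. ;)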
Answer 0 (score: 1)
Since you know that every line ends with a semicolon, just specify it as the row separator, e.g.
CSV.foreach(ARGV[0],col_sep:" ", row_sep:";").to_a
#=> [["CM_", "SG_", "1325", "XXX_Address", "XXX address"],
# ["CM_", "SG_", "612", "YYY_MsgCounter", "incremented by 1 each time a message has been transmitted"]]
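You will lose the newline inside that field; I am not sure whether that matters.
Note that, based on my discussion with @iGian, this solution works for Ruby < 2.6.0, while his solution works for Ruby >= 2.6.0.
Answer 1 (score: 0)
Try this, for Ruby 2.6.1: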
require 'pp'
require 'csv'
CSV.foreach(ARGV[0], col_sep: ' ', row_sep: :auto, liberal_parsing: {double_quote_outside_quote: true} ) do |row|
pp row
end
It seems to work. See this issue: https://github.com/ruby/csv/issues/66
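The question also asks about handing CSV a stream with the semicolons already removed. The following is only a minimal sketch of that route, not something taken from either answer: `parse_without_trailing_semicolons` is a hypothetical helper name, the fourth sample line is invented to cover the "not every line has a semicolon" case, and the regex assumes that a `";` at the end of a line always marks the end of a record rather than text inside a quoted field.

```ruby
require 'csv'

# Hypothetical helper (not part of the CSV library): drops a semicolon that
# directly follows a closing double quote at the end of a line, then parses
# the cleaned text using a single space as the column separator.
def parse_without_trailing_semicolons(text)
  # The lookahead keeps the line ending (\n or \r\n) untouched.
  cleaned = text.gsub(/";(?=[ \t]*\r?$)/, '"')
  CSV.parse(cleaned, col_sep: ' ')
end

sample = [
  'CM_ SG_ 1325 XXX_Address "XXX address";',
  'CM_ SG_ 612 YYY_MsgCounter "incremented by 1 each time a',
  'message has been transmitted";',
  'CM_ SG_ 7 ZZZ_Flag "no semicolon here"' # invented line without a semicolon
].join("\n")

rows = parse_without_trailing_semicolons(sample)
rows.each { |row| p row }
```

For a real file, the same helper could presumably be called as `parse_without_trailing_semicolons(File.read(ARGV[0]))`. This reads the whole file into memory, which is fine for small files but may not suit very large ones. Unlike the `row_sep: ";"` approach, it keeps the newline inside the multi-line field and does not depend on every record ending in a semicolon.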