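I receive a tab-delimited file that was saved with the default character set "Unicode". My understanding is that "Unicode" here probably means UTF-16.
When I try to open the file with this command: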
CSV.foreach(file, :col_sep => "\t", :headers => true) do |column|
  puts column[0]
end
I get the following error:
invalid byte sequence in UTF-8
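I know that if I open the file and save it as "UTF-8" it works fine, but I can't open the file by hand and do that every time. How can I resolve this error?
EDIT:
Per Stefan's request below, when I pass in encoding: 'UTF-16BE', I get: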
invalid byte sequence in UTF-16BE
Perhaps I'm passing the wrong encoding option?
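One way to narrow down which encoding option to pass is to look at the first bytes of the file for a byte order mark (BOM). The following is only a minimal sketch: guess_bom_encoding is a hypothetical helper name, not part of Ruby or the CSV library, and it assumes the exporting program actually writes a BOM, which not every program does.

```ruby
# Hypothetical helper: guess the encoding from a leading byte order mark (BOM).
# Returns nil when none of the BOMs checked here is present.
# Caveat: a UTF-32LE BOM (FF FE 00 00) also starts with FF FE and would be
# reported as UTF-16LE by this simple check.
def guess_bom_encoding(bytes)
  head = bytes.b # binary copy, so the comparisons below are byte-wise
  return 'UTF-8'    if head.start_with?("\xEF\xBB\xBF".b)
  return 'UTF-16LE' if head.start_with?("\xFF\xFE".b)
  return 'UTF-16BE' if head.start_with?("\xFE\xFF".b)
  nil
end

# Usage (file is the same path passed to CSV.foreach):
# puts guess_bom_encoding(File.binread(file, 4))
```

If this reports UTF-16LE rather than UTF-16BE, that might explain the "invalid byte sequence in UTF-16BE" error above.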
EDIT2:
When I pass in :encoding => 'ISO-8859-1', I get this error:
Illegal quoting in line 1. (CSV::MalformedCSVError)
Line 1 of my file looks like this:
"Status" "Internal ID" "Language" "Created At" "Updated At" "IP Address" "Location" "Username" "GET Variables" "Referrer" "Number of Saves" "Weighted Score" "Completion Time" "Invite Code" "Invite Email" "Invite Name" "Invite: branchid" "Invite: lastname" "Invite: clientname" "Invite: membershipid" "Invite: clientid" "Invite: dateofbirth" "Invite: membershiptype" "Invite: branch" "Invite: unitid" "Invite: shortname" "Invite: changedatetime" "Invite: homephone" "Collector"
I tried passing in quote_char, but I get the same error. My code now looks like this:
CSV.foreach(file, :col_sep => "\t", :encoding => 'ISO-8859-1', :quote_char => '"', :headers => true) do |column|
  puts column[0]
end
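If the file really is UTF-16, reading it as ISO-8859-1 would leave the BOM bytes and a NUL byte next to every character, which may be why the quoting check fails on line 1. As a rough sketch of an alternative to re-saving the file by hand, the bytes could be transcoded to UTF-8 in memory before parsing. This assumes the file is little-endian UTF-16 (the usual meaning of "Unicode" in Windows save dialogs), which I have not confirmed for this file; parse_utf16le_tsv is a hypothetical helper name.

```ruby
require 'csv'

# Hypothetical helper: decode raw bytes ASSUMED to be UTF-16LE into UTF-8,
# then parse the result as tab-separated data with a header row.
# encode raises an error if the bytes are not valid UTF-16LE.
def parse_utf16le_tsv(raw_bytes)
  text = raw_bytes.dup.force_encoding('UTF-16LE').encode('UTF-8')
  text = text.sub(/\A\uFEFF/, '') # drop the BOM if one was present
  CSV.parse(text, col_sep: "\t", headers: true)
end

# Usage (file is the same path as above):
# parse_utf16le_tsv(File.binread(file)).each { |row| puts row[0] }
```

This reads the whole file into memory, so it is only a reasonable fit for files of moderate size.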