Invalid byte sequence in UTF-8 using Ruby

Date: 2013-10-12 22:13:56

Tags: ruby csv encode

I received a tab-delimited file that opens with the default character set "Unicode". My understanding is that "Unicode" here probably means UTF-16.

When I try to read this file with the following code:

CSV.foreach(file, :col_sep => "\t", :headers => true) do |column|
    puts column[0]
end

I get the following error:

invalid byte sequence in UTF-8

I know that if I open the file and save it as "UTF-8" it works fine, but I can't open the file by hand and do that every time. How can I fix this error?

Edit:

When I pass in :encoding => 'UTF-16BE', per Stefan's request below, I get:

invalid byte sequence in UTF-16BE

Maybe I'm passing the wrong encoding option?
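One way to avoid guessing between UTF-16BE and UTF-16LE is to look at the byte order mark (BOM) at the start of the file and transcode to UTF-8 before handing the text to CSV. The following is only a sketch under assumptions: the helper names `read_as_utf8` and `each_tsv_row` are hypothetical, the file is assumed to fit in memory, and a file without a UTF-16 BOM is assumed to be UTF-8 already.

```ruby
require 'csv'

# Hypothetical helper (not part of the CSV library): returns the file's
# contents as a UTF-8 string, choosing the source encoding from the BOM.
def read_as_utf8(path)
  raw = File.binread(path) # raw bytes, tagged ASCII-8BIT

  if raw.start_with?("\xFF\xFE".b)
    source = Encoding::UTF_16LE
    raw = raw.byteslice(2..-1) # drop the 2-byte BOM
  elsif raw.start_with?("\xFE\xFF".b)
    source = Encoding::UTF_16BE
    raw = raw.byteslice(2..-1) # drop the 2-byte BOM
  else
    source = Encoding::UTF_8 # assumption: no UTF-16 BOM means UTF-8
  end

  raw.force_encoding(source).encode(Encoding::UTF_8)
end

# Hypothetical wrapper: yields each CSV::Row of a tab-separated file.
def each_tsv_row(path, &block)
  CSV.parse(read_as_utf8(path), :col_sep => "\t", :headers => true).each(&block)
end
```

With these definitions, `each_tsv_row(file) { |row| puts row[0] }` would play the role of the original `CSV.foreach` call. If the file has no BOM at all, the encoding would still have to be known from some other source.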

Edit 2:

When I pass in :encoding => 'ISO-8859-1', I get this error:

Illegal quoting in line 1. (CSV::MalformedCSVError)

Line 1 of my file looks like this:

"Status"    "Internal ID"   "Language"  "Created At"    "Updated At"    "IP Address"    "Location"  "Username"  "GET Variables" "Referrer"  "Number of Saves"   "Weighted Score"    "Completion Time"   "Invite Code"   "Invite Email"  "Invite Name"   "Invite: branchid"  "Invite: lastname"  "Invite: clientname"    "Invite: membershipid"  "Invite: clientid"  "Invite: dateofbirth"   "Invite: membershiptype"    "Invite: branch"    "Invite: unitid"    "Invite: shortname" "Invite: changedatetime"    "Invite: homephone" "Collector" 

I tried passing in quote_char, but I get the same error. My code now looks like this:

CSV.foreach(file, :col_sep => "\t", :encoding => 'ISO-8859-1', :quote_char => '"', :headers => true) do |column|
    puts column[0]
end
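A plausible explanation for the quoting error, assuming the file really is UTF-16LE with a byte order mark, is that ISO-8859-1 accepts any byte, so the BOM and the zero bytes of UTF-16 end up as ordinary characters around each quote. The snippet below is an illustration built from a made-up in-memory string rather than the actual file; it shows what the first header field looks like after such a misreading.

```ruby
# Build the UTF-16LE bytes of a BOM followed by the first header field.
utf16_bytes = "\uFEFF\"Status\"".encode(Encoding::UTF_16LE).b

# Reinterpret those same bytes as ISO-8859-1, then transcode for display.
as_latin1 = utf16_bytes.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# Inspect the first few characters the CSV parser would see.
p as_latin1.chars.first(5)
```

The opening quote is no longer the first character of the field and is followed by a NUL character, which would be consistent with the parser reporting illegal quoting on line 1 regardless of the quote_char setting.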

0 answers:

No answers