I'm saving files with ordinary characters in their names to an ext2 partition. Retrieving their names sometimes fails:
s = `find "#{@dir}" -type f -printf "%T@\t::::\t%s\t::::\t%p\n" |sort`
s.each_line {|l|
  file_name = l.chomp.split("\t::::\t")[2] #=>
  # ...66:in `split': invalid byte sequence in UTF-8 (ArgumentError)
}
Experiments:
l.encoding #=> UTF-8
l.valid_encoding? #=> false
l.inspect #=> "...St. Paul\xE2%80%99s Cathedral..."
Iconv.conv('utf-8', 'utf-8', l) #=>
# ...77:in `conv': "\xE2%80%99s Cathedr"... (Iconv::IllegalSequence)
How can I get the file name and delete the file?
I forgot to mention that in bash the file looks like this:
index.php?attTag=St. Paul?%80%99s Cathedral
Pasting this string back into ls returns "No such file or directory".
Answer 0 (score: 1)
You could try CGI.unescape before running the conversion:
...
a = "...St. Paul\xE2%80%99s Cathedral..."
puts a
require 'cgi'
b = CGI.unescape a
puts b
require 'iconv'
c = Iconv.conv('UTF-8//TRANSLIT', 'UTF-8', b) # may not even be necessary
puts c
On my ruby-1.9.2-p180 this outputs:
...St. Paul?%80%99s Cathedral...
...St. Paul’s Cathedral...
...St. Paul’s Cathedral...
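For the deletion part of the question, one possible approach is to leave the name undecoded and treat each line of the find output as raw bytes, since the bytes are what the filesystem stores. Below is a minimal sketch under that assumption. The helper names `raw_path` and `readable_name` and the sample line are hypothetical and not from the original post. It also avoids Iconv, which is no longer in the standard library from Ruby 2.0 onward, and uses `String#scrub`, which needs Ruby 2.1 or later.

```ruby
# Hypothetical helpers for illustration; not part of the original post.
SEPARATOR = "\t::::\t".b  # same separator as in the question, as a binary string

# Return the path field of one `find -printf` line as raw bytes.
# Splitting a binary string never raises "invalid byte sequence in UTF-8".
def raw_path(line)
  line.dup.force_encoding(Encoding::BINARY).chomp.split(SEPARATOR)[2]
end

# Build a printable name: turn %XX escapes into bytes, reinterpret the
# result as UTF-8, and replace any bytes that are still invalid with "?".
def readable_name(raw)
  decoded = raw.dup.force_encoding(Encoding::BINARY)
               .gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
  decoded.force_encoding(Encoding::UTF_8).scrub('?')
end

# Made-up sample line in the question's format (mtime, size, path).
line = "1300000000.0\t::::\t42\t::::\t/tmp/St. Paul\xE2%80%99s Cathedral\n"
path = raw_path(line)
puts readable_name(path)

# To remove the file, pass the untouched bytes, not the decoded name:
# File.delete(path)
```

The decoded name is only for display. The file on disk is named with the raw `\xE2` byte followed by the literal text `%80%99`, so `File.delete` should be given `path` exactly as find reported it.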