Paperclip upload: converting UTF-16LE to UTF-8, parsing with a regex, and creating new entries
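I'm working on some code that runs before save and parses information out of saved Windows / OS X system profile files. The Windows system profiles are encoded as UTF-16LE (according to my text editor), and when I use gsub or a regex on them, Ruby raises an invalid byte sequence error. I'm using Ruby 2.2.1.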
file = '/Users/User/Desktop/HTPC-system.txt'
File.open(file) do |file|
  file.each_line do |line|
    puts line.gsub(/\bSystem\b/)
  end
end
This returns:
#<Enumerator:0x007fdb19828240>
#<Enumerator:0x007fdb19828ad8>
#<Enumerator:0x007fdb1982a0b8>
#<Enumerator:0x007fdb1982aba8>
#<Enumerator:0x007fdb1a0eff40>
#<Enumerator:0x007fdb1a0efe00>
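As far as I can tell, the Enumerator objects come from calling `gsub` with only a pattern. With no replacement string and no block, `String#gsub` returns an Enumerator rather than a new string. A minimal sketch of the difference, using a made-up sample line (the `line` value is only an illustration, not taken from the real profile):

```ruby
# Hypothetical sample line, not from the actual system profile.
line = "System Name: HTPC"

no_replacement   = line.gsub(/\bSystem\b/)      # pattern only: returns an Enumerator
with_replacement = line.gsub(/\bSystem\b/, "")  # pattern + replacement: returns a String

puts no_replacement.class       # prints "Enumerator"
puts with_replacement.inspect   # prints "\" Name: HTPC\""
```

So the first snippet never reaches the encoding problem; it just prints the Enumerator for each line.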
I also tried this:
open_file = '/Users/User/Desktop/HTPC-system.txt'
encoded_file = open_file.encode('UTF-16LE').encode('utf-8')
File.open('/Users/ncrmro/Downloads/HTPC-ncrmro.txt') do |file|
  file.each_line do |line|
    puts line.gsub(/\bSystem\b/, "")
  end
end
That raises this error:
/Users/User/Desktop/TextParse.rb:10:in `gsub': invalid byte sequence in UTF-8 (ArgumentError)
from /Users/User/Downloads/TextParse.rb:10:in `block (2 levels) in <main>'
from /Users/User/Downloads/TextParse.rb:9:in `each_line'
from /Users/User/Downloads/TextParse.rb:9:in `block in <main>'
from /Users/User/Downloads/TextParse.rb:8:in `open'
from /Users/User/Downloads/TextParse.rb:8:in `<main>'
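Looking at the second attempt again, `encode` is called on the path string rather than on the file's contents, so the lines read by `File.open` are probably still UTF-16LE bytes being treated as UTF-8. Below is a rough sketch of what I think the conversion should look like, assuming the file really is UTF-16LE. The helper name `read_utf16le_as_utf8` and the temporary sample file are made up for illustration and are not part of my actual script:

```ruby
require 'tempfile'

# Hypothetical helper (an assumption, not from the original script):
# read a UTF-16LE file and return its contents as a UTF-8 string.
def read_utf16le_as_utf8(path)
  raw  = File.binread(path)                 # raw bytes, tagged ASCII-8BIT
  text = raw.force_encoding('UTF-16LE')     # relabel the bytes; no conversion yet
  text.encode('UTF-8').sub(/\A\uFEFF/, '')  # transcode, then drop a leading BOM if present
end

# Build a small UTF-16LE sample so the sketch runs without the real profile.
sample = Tempfile.new('profile')
sample.binmode
sample.write("\uFEFFSystem Name: HTPC\r\nSystem Type: x64\r\n".encode('UTF-16LE'))
sample.close

# Once the text is UTF-8, gsub with a regex works line by line.
cleaned = read_utf16le_as_utf8(sample.path).each_line.map do |line|
  line.gsub(/\bSystem\b/, '').strip
end

puts cleaned.inspect  # prints ["Name: HTPC", "Type: x64"]
sample.unlink
```

This reads the whole file into memory, which seems acceptable for a system profile, but I have not checked it against a real upload or inside the Paperclip callback.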