Question

I have a string such as:

"MÃ\u0083Â¼LLER".encoding
#=> #<Encoding:UTF-8>

"MÃ\u0083Â¼LLER".inspect
#=> "\"MÃ\\u0083Â¼LLER\""

What can I do to salvage a string like this, given that I no longer have the original data? Can it be salvaged at all?

Answer 1

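It looks like the string's UTF-8 bytes were misread as Latin-1 and re-encoded as UTF-8, twice. To undo that, convert from UTF-8 back to Latin-1 twice. Try this on some of your data and let me know whether it works: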

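require 'iconv'  # in the standard library up to Ruby 1.9; a separate gem since Ruby 2.0

def decode(str)
  # Iconv.new(to, from): this converter goes from UTF-8 to Latin-1
  i = Iconv.new('LATIN1','UTF-8')
  # Undo both rounds of mis-encoding, then label the resulting bytes as UTF-8
  i.iconv(i.iconv(str)).force_encoding('UTF-8')
end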

decode("MÃ\u0083Â¼LLER")
#=> "MüLLER"
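
On Ruby 2.0 and later, where `iconv` is no longer part of the standard library, the same double round-trip can probably be done with the built-in `String#encode`. The following is only a minimal sketch and not part of the original answer: the method name `fix_double_mojibake` and its `rounds` parameter are made up for illustration, and it assumes the damage really is "UTF-8 read as Latin-1 and re-encoded", repeated `rounds` times.

```ruby
# Hypothetical helper (name and `rounds` parameter are illustrative only).
# Each round maps the characters back to single Latin-1 bytes, then
# reinterprets those bytes as UTF-8, undoing one layer of mis-encoding.
def fix_double_mojibake(str, rounds = 2)
  rounds.times do
    str = str.encode('ISO-8859-1').force_encoding('UTF-8')
  end
  str
end

fix_double_mojibake("MÃ\u0083Â¼LLER")
#=> "MüLLER"
```

Note that `encode` raises `Encoding::UndefinedConversionError` if the string contains characters outside Latin-1, so a string that was never mangled this way will fail loudly rather than be silently altered. It is worth checking `valid_encoding?` on the result before trusting it.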
