Question

I am getting errors such as

UnicodeEncodeError('ascii', u'\x01\xff \xfeJ a z z', 1, 2, 'ordinal not in range(128)'

I also get sequences such as

u'\x17\x01\xff \xfeA r t   B l a k e y'

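I recognize \x01\xff\xfe as a BOM, but how do I convert these to the obvious output (Jazz and Art Blakey)?

These come from a program that reads music file tags.

I have tried various encodings, such as s.encode('utf8'), and various decodes followed by encodes, without success.

As requested: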

from hsaudiotag import auto
inf = 'test.mp3'
song = auto.File(inf)
print song.album, song.artist, song.title, song.genre

> Traceback (most recent call last):
>   File "audio2.py", line 4, in <module>
>     print song.album, song.artist, song.title, song.genre
>   File "C:\program files\python27\lib\encodings\cp437.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\xfe' in position 4: character maps to <undefined>

If I change the print statement to

with open('x', 'wb') as f:
    f.write(song.genre)

I get

Traceback (most recent call last):
  File "audio2.py", line 6, in <module>
    f.write(song.genre)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in position 1: ordinal not in range(128)

Answer 1

For your actual problem: you need to write bytes, not characters, to the file. Call:

f.write(song.genre.encode('utf-8'))

and you won't get the error. Alternatively, you can use io.open to get a character stream that encodes automatically, i.e.:

import io

# Text mode ('w'), not 'wb': binary mode does not accept an encoding argument.
with io.open('x', 'w', encoding='utf-8') as f:
    f.write(song.genre)
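To make the difference concrete, here is a minimal sketch of both approaches side by side. The value of `genre` is a made-up stand-in for `song.genre`, and the temporary path is only there so the snippet does not clobber anything; neither comes from the question.

```python
import io
import os
import tempfile

genre = u'\xfeJazz'  # hypothetical stand-in for song.genre

# Writing a unicode string to a binary-mode file in Python 2 makes Python
# encode it with the default ASCII codec first. The same step done by hand:
try:
    genre.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False  # same kind of failure as in the traceback

path = os.path.join(tempfile.mkdtemp(), 'x')

# Option 1: binary mode, encode explicitly.
with open(path, 'wb') as f:
    f.write(genre.encode('utf-8'))
with open(path, 'rb') as f:
    explicit_bytes = f.read()

# Option 2: text mode via io.open, which encodes for you.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(genre)
with open(path, 'rb') as f:
    auto_bytes = f.read()
```

Both options should leave the same bytes on disk; which one you pick is mostly a matter of whether the rest of your code deals in bytes or in text.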

Printing Unicode to the console can run into difficulties of its own (especially on Windows); see PrintFails on the Python wiki.
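If you only need the print to stop crashing, one possible workaround is to escape whatever the console codec cannot represent. This is a rough sketch, not something from PrintFails; `console_safe` is a name I made up, and falling back to ASCII when the console encoding is unknown is my own assumption.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the console cannot show replaced by \\x escapes.

    Hypothetical helper: 'encoding' defaults to whatever sys.stdout reports,
    falling back to ASCII if it reports nothing.
    """
    enc = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    # Encode with escapes for unencodable characters, then decode back so the
    # result is still a text string that is safe to print.
    return text.encode(enc, 'backslashreplace').decode(enc)

print(console_safe(u'\xfeJazz', 'cp437'))
```

You lose the original character in the output, but you can see exactly which code point was there, which helps when debugging tag data.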

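However, as discussed in the comments, what you have doesn't look like a valid tag value at all. It looks more like a misplaced chunk of ID3v2 frame data, which may not be recoverable. I don't know whether this is a bug in your tag-reading library or just a file with garbage tags.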
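That said, if you want to try salvaging the values, here is a speculative sketch. In an ID3v2 text frame the first byte is an encoding marker, and 0x01 means UTF-16 with a BOM, which would match the \x01\xff\xfe you are seeing. The sketch assumes the library handed back the raw frame bytes as code points 0-255 and that the gaps between the letters in your output are really NUL characters; both are guesses on my part. `salvage_id3_text` is a made-up name, not part of hsaudiotag.

```python
def salvage_id3_text(raw):
    """Best-effort recovery of text from a mis-decoded ID3v2 text-frame payload.

    Assumes every character in 'raw' is in the range 0-255 (otherwise the
    latin-1 encode below raises UnicodeEncodeError).
    """
    data = raw.encode('latin-1')       # turn code points 0-255 back into bytes
    start = data.find(b'\xff\xfe')     # look for a little-endian UTF-16 BOM
    if start == -1:
        return raw                     # no BOM found, nothing to salvage
    payload = data[start:]
    if len(payload) % 2:               # drop a dangling odd byte, if any
        payload = payload[:-1]
    # The 'utf-16' codec consumes the BOM and picks the byte order from it.
    return payload.decode('utf-16').rstrip(u'\x00')

# Assumed reconstruction of the values in the question, with NULs for the gaps.
print(salvage_id3_text(u'\x01\xff\xfeJ\x00a\x00z\x00z\x00'))
```

This skips anything before the BOM (such as the stray \x17), so treat the result with suspicion and check it against what a different tag reader reports for the same file.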
