Question

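How can I convert the str '\xe3\x81\x82' into the bytes b'\xe3\x81\x82'?

Ultimately, I want u'\u3042', which is the Japanese character 'あ'.

b'\xe3\x81\x82'.decode('utf-8') gives u'\u3042', but

'\xe3\x81\x82'.decode('utf-8') raises the following error: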

AttributeError: 'str' object has no attribute 'decode'

This happens because b'\xe3\x81\x82' is bytes, while '\xe3\x81\x82' is str.

My database contains data such as '\xe3\x81\x82'.

Answer 1

If you have bytes masquerading as Unicode code points, encode to Latin-1 first, then decode as UTF-8:

'\xe3\x81\x82'.encode('latin1').decode('utf-8')

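Latin-1 (ISO-8859-1) maps the Unicode code points U+0000 through U+00FF one-to-one onto the byte values 0x00 through 0xFF, so the round trip loses nothing: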

>>> '\xe3\x81\x82'.encode('latin1').decode('utf-8')
'あ'
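
If many values from the database need this treatment, the same trick can be wrapped in a small helper. This is only a sketch: the name fix_mojibake is made up for illustration, and returning the input unchanged when the conversion fails is an assumption about what you would want, not something the question specifies.

```python
def fix_mojibake(text: str) -> str:
    """Reinterpret a str whose code points are really UTF-8 byte values.

    Hypothetical helper: the name and the fallback behaviour are assumptions.
    """
    try:
        # Latin-1 maps U+0000..U+00FF straight to bytes 0x00..0xFF.
        raw = text.encode('latin-1')
        # Those bytes are the real UTF-8 data, so decode them properly.
        return raw.decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has code points above U+00FF, or the bytes are
        # not valid UTF-8. Assume it was not mis-decoded and leave it alone.
        return text


# Latin-1 round-trips every byte value, so nothing is lost on the way back.
assert bytes(range(256)).decode('latin-1').encode('latin-1') == bytes(range(256))

print(fix_mojibake('\xe3\x81\x82'))  # あ
print(fix_mojibake('あ'))            # already correct, returned unchanged
```

Note that a string which is genuinely Latin-1 text and also happens to form valid UTF-8 bytes would be converted too, so it is worth checking a sample of the data before applying this to a whole column.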