Question

What is the right way to use str.decode and unicode.encode?

For example:

print str.decode
print unicode.encode

Answer 1

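Ignacio's example is correct, but it depends on your console being able to display Unicode characters, which on Windows it usually can't. Here is the same thing using only safe string escapes (reprs):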

>>> '\xe3\x81\x82'.decode('utf-8')    # three top-bit-set bytes, representing one character
u'\u3042'                             # Hiragana letter A

>>> u'\u3042'.encode('shift-jis')
'\x82\xa0'                            # only requires two bytes in the Shift-JIS encoding

>>> unicode('\x82\xa0', 'shift-jis')  # alternative way of doing a decode
u'\u3042'

It is easier when you are writing to, say, a file or over a web server, or when you are on another operating system whose console supports UTF-8.
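The decode-then-encode round trip above can be wrapped in a small helper. This is only a minimal sketch: `transcode` is a hypothetical name, not something from the standard library, and the byte literals use the `b''` prefix so the snippet should behave the same on Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-

def transcode(data, source_encoding, target_encoding):
    """Hypothetical helper: re-encode a byte string from one encoding to another.

    The bytes are first decoded to Unicode text, then that text is
    encoded again in the target encoding.
    """
    text = data.decode(source_encoding)   # bytes -> Unicode text
    return text.encode(target_encoding)   # Unicode text -> bytes


# Hiragana letter A: three bytes in UTF-8, two bytes in Shift-JIS.
utf8_bytes = b'\xe3\x81\x82'
sjis_bytes = transcode(utf8_bytes, 'utf-8', 'shift-jis')

# Printing the repr avoids depending on what the console can display.
print(repr(sjis_bytes))
```

Going through Unicode text in the middle is the point: bytes in one encoding are never converted directly into bytes in another.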

Answer 2

print 'あ'.decode('utf-8')
print repr(u'あ'.encode('shift-jis'))

Answer 3

>>> unicode.encode(u"abcd","utf8")
'abcd' #unicode string u"abcd" got encoded to UTF-8 encoded string "abcd"

>>> str.decode("abcd","utf8")
u'abcd' #UTF-8 string "abcd" got decoded to python's unicode object u"abcd"
>>>
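The "abcd" example looks like a no-op because ASCII characters encode to the same single bytes in UTF-8. A minimal sketch, assuming a hypothetical helper named `encoded_length`, shows that the byte count only diverges from the character count once non-ASCII characters are involved. It is written to run on both Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-

def encoded_length(text, encoding):
    """Hypothetical helper: number of bytes needed to store text in an encoding."""
    return len(text.encode(encoding))


ascii_text = u'abcd'
kana = u'\u3042'  # Hiragana letter A

# ASCII text: one byte per character in UTF-8.
ascii_utf8_len = encoded_length(ascii_text, 'utf-8')

# Non-ASCII text: the same single character needs a different
# number of bytes depending on the encoding.
kana_utf8_len = encoded_length(kana, 'utf-8')
kana_sjis_len = encoded_length(kana, 'shift-jis')

print(ascii_utf8_len, kana_utf8_len, kana_sjis_len)
```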
