Question

我想确保我的代码中的所有字符串都是unicode，所以我使用unicode_literals，然后我需要将字符串写入文件：

from __future__ import unicode_literals
with open('/tmp/test', 'wb') as f:
    f.write("中文") # UnicodeEncodeError

所以我需要这样做：

from __future__ import unicode_literals
with open('/tmp/test', 'wb') as f:
    f.write("中文".encode("utf-8"))
    f.write("中文".encode("utf-8"))
    f.write("中文".encode("utf-8"))
    f.write("中文".encode("utf-8"))

但每次我需要在代码中编码时，我都很懒，所以我改为编解码器：

from __future__ import unicode_literals
from codecs import open
import locale, codecs
lang, encoding = locale.getdefaultlocale()

with open('/tmp/test', 'wb', encoding) as f:
    f.write("中文")

如果我只是想写文件，那么我认为这太过分了吗？

Answer 1

您无需致电.encode()，也无需明确致电locale.getdefaultlocale()：

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import io

with io.open('/tmp/test', 'w') as file:
    file.write(u"中文" * 4)

它使用locale.getpreferredencoding(False)字符编码将Unicode文本保存到文件中。

在Python 3上：

您不需要使用显式编码声明（# -*- coding: utf-8 -*-），在Python源代码中使用文字非ascii字符。 utf-8是默认设置。
您无需使用import io：内置open() io.open()
您不需要使用u''（u前缀）。 ''文字默认为Unicode。如果您想省略u''，请在问题代码中放回from __future__ import unicode_literals。

即，完整的Python 3代码是：

#!/usr/bin/env python3

with open('/tmp/test', 'w') as file:
    file.write("中文" * 4)

Answer 2

这个解决方案怎么样？

Write to UTF-8 file in Python

只有三行代码。

python写unicode容易文件？

2 个答案: