Question

I'm running into a strange problem with __future__.unicode_literals in Python. Without importing unicode_literals, I get the correct output:

# encoding: utf-8
# from __future__ import unicode_literals
name = 'helló wörld from example'
print name

But when I add the unicode_literals import:

# encoding: utf-8
from __future__ import unicode_literals
name = 'helló wörld from example'
print name

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 4: ordinal not in range(128)

Does unicode_literals encode every string as UTF-8? What should I do to get rid of this error?

Answer 1

Your terminal or console is failing to let Python know that it supports UTF-8.

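Without the from __future__ import unicode_literals line, you are building a byte string that holds UTF-8 encoded bytes. With the import, the same literal builds a unicode string. So unicode_literals does not encode anything as UTF-8; it only changes the type of the literal from a byte string to unicode.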

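print has to treat these two values differently. A byte string is written to sys.stdout unchanged. A unicode string has to be encoded to bytes first, and Python consults sys.stdout.encoding to decide how. If your system doesn't correctly tell Python which codec it supports, the default is ASCII.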
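The encoding step described above can be reproduced without a terminal at all. The following is a minimal sketch; try_encode is a hypothetical helper introduced only for illustration, and u'..' literals are used so the snippet behaves the same with or without unicode_literals.

```python
# -*- coding: utf-8 -*-


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text.

    Hypothetical helper, not part of the original question or answer.
    """
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


name = u'helló wörld from example'

# ASCII has no representation for u'\xf3' (the 'ó' at position 4),
# so this is the same failure that print hits in the question.
ascii_result = try_encode(name, 'ascii')

# UTF-8 can represent every code point, so this succeeds.
utf8_result = try_encode(name, 'utf-8')

print(ascii_result is None)
print(utf8_result is not None)
```

The only difference between the two calls is the codec name, which is exactly what sys.stdout.encoding supplies when print encodes a unicode value.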

Your system failed to tell Python which codec to use; sys.stdout.encoding is set to ASCII, so encoding the unicode value for printing fails.
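To see what Python was actually told, you can inspect the encodings it reports. This is only a diagnostic sketch; stdout_encoding_report is a hypothetical name, and the values it prints depend entirely on your environment.

```python
import locale
import sys


def stdout_encoding_report(stream=None):
    """Collect the encodings Python reports for a stream and for the locale.

    Hypothetical helper for diagnosis only.
    """
    stream = sys.stdout if stream is None else stream
    return {
        # What print would use for unicode values; may be None when piped.
        'stream': getattr(stream, 'encoding', None),
        # Python's built-in default codec.
        'default': sys.getdefaultencoding(),
        # What the locale settings suggest, without changing the locale.
        'locale': locale.getpreferredencoding(False),
    }


report = stdout_encoding_report()
for key in sorted(report):
    print('{0} = {1}'.format(key, report[key]))
```

If the stream entry shows an ASCII variant (or nothing at all), that matches the failure described above.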

You can verify this by manually encoding to UTF-8 when printing:

# encoding: utf-8
from __future__ import unicode_literals
name = 'helló wörld from example'
print name.encode('utf8')

You can also reproduce the problem by creating a unicode literal without the from __future__ import statement:

# encoding: utf-8
name = u'helló wörld from example'
print name

where u'..' is a unicode literal too.

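Without details about your environment, it is hard to say what the solution is; it depends very much on the operating system and the console or terminal you are using.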
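As one possible workaround, and not a substitute for fixing the terminal or locale configuration, you could encode explicitly and fall back to UTF-8 whenever the stream reports ASCII or nothing. This is a sketch under that assumption; pick_encoding, encode_for_stream and AsciiStream are hypothetical names, and it only makes sense if your terminal really does display UTF-8. Setting the PYTHONIOENCODING environment variable before starting Python is another commonly used option.

```python
# -*- coding: utf-8 -*-
import codecs
import sys


def pick_encoding(reported, fallback='utf-8'):
    """Keep the reported codec unless it is missing, unknown, or ASCII."""
    if not reported:
        return fallback
    try:
        # codecs.lookup resolves aliases to a canonical codec name.
        canonical = codecs.lookup(reported).name
    except LookupError:
        return fallback
    return fallback if canonical == 'ascii' else reported


def encode_for_stream(text, stream=None, fallback='utf-8'):
    """Encode text for a stream, using the fallback when it reports ASCII."""
    stream = sys.stdout if stream is None else stream
    reported = getattr(stream, 'encoding', None)
    return text.encode(pick_encoding(reported, fallback))


class AsciiStream(object):
    """Hypothetical stand-in for a stdout that reports ASCII."""
    encoding = 'ascii'


encoded = encode_for_stream(u'helló wörld from example', AsciiStream())
print(repr(encoded))
```

The result is a byte string, so in Python 2 it can be passed to print or sys.stdout.write unchanged, just like the name.encode('utf8') example above.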