Question

When I try to print a Unicode string on Linux, I get a UnicodeEncodeError exception. On Windows I do not get the error.

On Linux, the code and its output are:

    my_str = u'\u4ece\u5165\u5e93'
    print "%r" % my_str  # output: u'\u4ece\u5165\u5e93'
    print "%s" % my_str  # output: UnicodeEncodeError: 'ascii' codec can't encode character u'\u4ece' in position 0: ordinal not in range(128)

On Windows, the same code prints the string without error:

    my_str = u'\u4ece\u5165\u5e93'
    print "%r" % my_str  # output: u'\u4ece\u5165\u5e93'
    print "%s" % my_str  # output: 从入库

Answer 1

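Your locale and/or environment is most likely broken, not installed, not set, or set to C. Python uses the locale settings to choose the encoder it applies to stdout, which is what allows Unicode strings to be encoded to the appropriate encoding.

If you run Python from the command line, make sure your locale is healthy. Type locale and you should see something like this: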

    $ locale
    LANG=en_GB.UTF-8
    LANGUAGE=
    LC_CTYPE="en_GB.UTF-8"
    LC_NUMERIC="en_GB.UTF-8"
    LC_TIME="en_GB.UTF-8"
    LC_COLLATE="en_GB.UTF-8"
    LC_MONETARY="en_GB.UTF-8"
    LC_MESSAGES="en_GB.UTF-8"
    LC_PAPER="en_GB.UTF-8"
    LC_NAME="en_GB.UTF-8"
    LC_ADDRESS="en_GB.UTF-8"
    LC_TELEPHONE="en_GB.UTF-8"
    LC_MEASUREMENT="en_GB.UTF-8"
    LC_IDENTIFICATION="en_GB.UTF-8"
    LC_ALL=

If you see an error message, or LANG=C or similar, Python will use the ASCII encoder, which rejects non-ASCII characters.
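
The effect of the ASCII encoder can be reproduced without touching the locale by encoding the string explicitly. The following is only a minimal sketch: the helper name try_encode is made up for illustration, and the code is written so that it should run under both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch: encode the question's string with two codecs to show why
# an ASCII stdout fails while a UTF-8 stdout succeeds.
# try_encode is a hypothetical helper, not part of any library.

my_str = u'\u4ece\u5165\u5e93'


def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


# With 'ascii' the encoder rejects the characters and we get the message.
ascii_result = try_encode(my_str, 'ascii')

# With 'utf-8' the same characters encode to a byte string.
utf8_result = try_encode(my_str, 'utf-8')
```

Here ascii_result holds the codec's error message and utf8_result holds the encoded bytes, which mirrors the difference between the failing and the working terminal.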

To find out which locales are installed on your system, type locale -a. Pick an appropriate one, preferably one ending in "UTF-8", and set LANG accordingly, e.g.:

    export LANG=en_GB.UTF-8

Run locale again and check for errors. If errors still appear, you will need to research how to rebuild the locales for your distribution.

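If you are running inside an IDE, or cannot fix the locale, you may have success adding the following environment variable to your shell or IDE run configuration:

    export PYTHONIOENCODING=utf-8

This tells Python to ignore the locale and apply the UTF-8 encoder to stdout (the same setting also covers stdin and stderr).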
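
To illustrate what "applying the UTF-8 encoder" to an output stream amounts to, here is a small sketch. It is not how Python wires up stdout internally. It wraps a byte stream in a UTF-8 writer from the codecs module, and io.BytesIO is used as a stand-in for stdout (an assumption made so that the result can be inspected). It should run under both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch: wrap a byte-oriented stream in a UTF-8 encoding writer.
# io.BytesIO stands in for stdout here; this is an illustrative assumption.
import codecs
import io

buffer = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named codec;
# instantiating it with a stream makes write() encode text before writing.
writer = codecs.getwriter('utf-8')(buffer)
writer.write(u'\u4ece\u5165\u5e93\n')

# The bytes that a UTF-8 terminal would receive.
encoded = buffer.getvalue()
```

The text is written as Unicode and arrives in the stream as UTF-8 bytes, so no ASCII encoder is involved.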

You can check what Python is using for the locale with the locale module. My healthy locale returns:

    >>> import locale
    >>> locale.getdefaultlocale()
    ('en_GB', 'UTF-8')
    >>> locale.getpreferredencoding()
    'UTF-8'

An unhealthy locale will return US-ASCII (possibly under an alias such as ANSI_X3.4-1968) for locale.getpreferredencoding().
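
The checks above can be gathered into one small diagnostic script. This is only a sketch: the function name describe_output_encoding is made up for illustration, and it reports only the settings mentioned in this answer. It should run under both Python 2.7 and Python 3.

```python
# Sketch: collect the settings that influence the encoding used for stdout.
# describe_output_encoding is a hypothetical helper, not a library function.
import locale
import os
import sys


def describe_output_encoding():
    """Return the locale-related settings discussed above as a dict."""
    return {
        # May be None when stdout is redirected or replaced by another object.
        'stdout_encoding': getattr(sys.stdout, 'encoding', None),
        'preferred_encoding': locale.getpreferredencoding(),
        'LANG': os.environ.get('LANG'),
        'LC_ALL': os.environ.get('LC_ALL'),
        'PYTHONIOENCODING': os.environ.get('PYTHONIOENCODING'),
    }


if __name__ == '__main__':
    for name, value in sorted(describe_output_encoding().items()):
        print('%s: %r' % (name, value))
```

If stdout_encoding or preferred_encoding reports an ASCII variant, the locale fixes described above are the first thing to try.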

Answer 2

You can try:

    print u"{0}".format(my_str)  # my_str is a unicode string

or:

    print u"{0}".format(l.decode('utf-8'))  # l is a UTF-8 encoded byte string

UnicodeEncodeError on Linux, but not on Windows
