Question

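I've seen a few posts related to this, but no clear answer. Suppose I want to print the string s=u'\xe9\xe1' in a terminal that only supports ASCII (e.g. LC_ALL=C; python3). Is there a way to configure the following as the default behavior: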

import sys
s = u'\xe9\xe1'
s = s.encode(sys.stdout.encoding, 'replace').decode(sys.stdout.encoding)
print(s)

That is, I want the string to print something, even garbage, rather than raise an exception (UnicodeEncodeError). I'm using Python 3.5.

I'd like to avoid writing this for every string that may contain UTF-8.

Answer 1

You can do one of three things:

Use the PYTHONIOENCODING environment variable to adjust the error handler for stdout and stderr:
```
export PYTHONIOENCODING=:replace
```
Note the leading colon (:). I didn't specify a codec, only the error handler.
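
To try the variable without touching your shell, it can be set for a child interpreter only. This is a minimal sketch and not part of the original answer: it pins the codec to `ascii` (so `ascii:replace` instead of a bare `:replace`) purely so the result doesn't depend on the locale of the machine running it, and `child_code`, `env` and `result` are illustrative names.

```python
import os
import subprocess
import sys

# Code run by the child interpreter: print two non-ASCII characters.
child_code = r"print('\xe9\xe1')"

# Copy the current environment and override only the I/O encoding setting.
# Assumption: 'ascii' is pinned here for reproducibility; ':replace' alone
# keeps whatever codec Python would otherwise pick for stdout.
env = dict(os.environ, PYTHONIOENCODING="ascii:replace")

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Characters the codec cannot encode come out as '?' instead of
# raising UnicodeEncodeError in the child.
print(result.returncode, result.stdout)
```

The child should exit normally, with each unencodable character replaced by a question mark.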

Replace the stdout TextIOWrapper, setting a different error handler:

import sys
import io

sys.stdout = io.TextIOWrapper(
    sys.stdout.buffer, encoding=sys.stdout.encoding, 
    errors='replace',
    line_buffering=sys.stdout.line_buffering)

Create a separate TextIOWrapper instance around sys.stdout.buffer, and pass it in as the file argument when printing:

import sys
import io

replacing_stdout = io.TextIOWrapper(
    sys.stdout.buffer, encoding=sys.stdout.encoding, 
    errors='replace',
    line_buffering=sys.stdout.line_buffering)

print(s, file=replacing_stdout)
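
The effect of errors='replace' can also be checked without a real terminal. The sketch below is an illustration rather than part of the answer: it wraps an in-memory `io.BytesIO` (standing in for `sys.stdout.buffer`) and hard-codes `ascii` as the encoding, both of which are assumptions made for the demo; `raw`, `ascii_out` and `strict_raised` are made-up names.

```python
import io

# In-memory byte buffer standing in for sys.stdout.buffer (assumption).
raw = io.BytesIO()

# Same construction as above, but with the encoding fixed to ASCII.
# newline="\n" keeps the line ending identical on every platform.
ascii_out = io.TextIOWrapper(
    raw, encoding="ascii",
    errors="replace",
    newline="\n",
    line_buffering=True)

print("\xe9\xe1", file=ascii_out)
ascii_out.flush()
captured = raw.getvalue()  # each unencodable character became b"?"

# For contrast: the default 'strict' handler raises on the same text.
strict_out = io.TextIOWrapper(io.BytesIO(), encoding="ascii", errors="strict")
try:
    strict_out.write("\xe9\xe1")
    strict_out.flush()
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True

print(captured, strict_raised)
```

The only difference between the two wrappers that matters here is the `errors` argument, which is the knob all three options above end up turning.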
