Question

I have written a program that prints some HTML content. My source file is UTF-8, the server terminal is UTF-8, and I also use:

out = out.encode('utf8')

to make sure the string is in UTF-8. Despite this, when I use characters such as 'ã' or 'é' in the string, I get:

UnicodeEncodeError: 'ascii' codec can't encode character '\xe3' in position 84: ordinal not in range(128)

It seems to me that after printing:

print("Content-Type: text/html; charset=utf-8 \n\n")

it is forced to use ASCII encoding... but I just don't know why that would be the case.

Answer 1

I guess you should read the file as a unicode object; that way you might not need to encode it.

import codecs
file = codecs.open('file.html', 'w', 'utf-8')
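
In Python 3, the built-in open() also accepts an encoding argument, so a file can be read directly as str without a separate decoding step. A minimal sketch, assuming a hypothetical file named page.html that is created in a temporary directory only to keep the example self-contained:

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a file and return its contents as str, decoding from UTF-8."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Hypothetical file, written here only so the sketch can run on its own.
tmp_dir = tempfile.mkdtemp()
page_path = os.path.join(tmp_dir, "page.html")
with open(page_path, "w", encoding="utf-8") as handle:
    handle.write("<p>ã é</p>")

# The result is already a str, so no .decode() call is needed.
html = read_text_utf8(page_path)
```

Note that reading the file correctly does not by itself change how print() encodes the text on the way out; that still depends on the encoding of sys.stdout.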

Answer 2

Thank you very much.

This is how I solved the encoding problem with Python 3.4.1: first, I inserted this line into my code to check the output encoding:

import sys
print(sys.stdout.encoding)

I saw that the output encoding was:

ANSI_X3.4-1968

which stands for ASCII and does not support characters such as 'ã' or 'é'.
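
The link between that codec name and the error can be checked directly. The sketch below looks up ANSI_X3.4-1968 in Python's codec registry and tries to encode 'ã' with it; the helper name can_encode is made up for this illustration.

```python
import codecs


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# ANSI_X3.4-1968 is registered as an alias of the ascii codec.
codec_name = codecs.lookup("ANSI_X3.4-1968").name

# 'ã' is U+00E3, which is outside the 0-127 range that ASCII covers.
ascii_ok = can_encode("ã", "ANSI_X3.4-1968")
utf8_ok = can_encode("ã", "utf-8")
```

Here codec_name is 'ascii', ascii_ok is False and utf8_ok is True, which matches the "ordinal not in range(128)" message in the question.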

So I removed the previous line and inserted the following lines to change the standard output encoding:

import codecs
import sys

if sys.stdout.encoding != 'UTF-8':
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'UTF-8':
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')
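
Another way to sidestep the text layer, shown here only as a sketch and not as the method used above, is to encode the page yourself and write the bytes to the underlying binary stream. The function name emit_html is hypothetical, and an in-memory io.BytesIO stands in for sys.stdout.buffer so the example runs anywhere.

```python
import io


def emit_html(binary_stream, body):
    """Write a CGI-style header plus body to a binary stream as UTF-8 bytes."""
    header = "Content-Type: text/html; charset=utf-8\n\n"
    # Encoding explicitly means the stream's own text encoding is never consulted.
    binary_stream.write((header + body).encode("utf-8"))
    binary_stream.flush()


# Stand-in for sys.stdout.buffer (an assumption made for this illustration).
fake_stdout = io.BytesIO()
emit_html(fake_stdout, "<p>ã é</p>\n")
payload = fake_stdout.getvalue()
```

In a real script the call would be emit_html(sys.stdout.buffer, page). On Python 3.7 and later, sys.stdout.reconfigure(encoding='utf-8') is a further option, but it is not available in the 3.4.1 release mentioned here.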

Here is where I found the information:

http://www.macfreek.nl/memory/Encoding_of_Python_stdout

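P.S.: Everyone says that changing the default encoding is not good practice. I really don't know. In my case it works fine for me, but I am building a very small and simple web app.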

Encoding problem in Python - 'ascii' codec can't encode character '\xe3' when using UTF-8
