Question

I run SimpleHTTPServer in Python 3.6.4 (64-bit) with the following command:

python -m http.server --cgi

Then I create a form in test.py that submits to test_form_action.py, which prints the text that was entered.

cgi-bin/test.py

# coding=utf-8
from __future__ import unicode_literals, absolute_import

print("Content-Type: text/html")  # HTML is following
print()
reshtml = '''<!DOCTYPE html>
<html lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html" charset="utf-8"/>
</head>
<body>
<div style="text-align: center;">
    <form action="/cgi-bin/test_form_action.py" method="POST"
          target="_blank">
        输入:<input type="text" id= "id" name="name"/></td>
        <button type="submit">Submit</button>
    </form>
</div>
</body>
</html>'''

print(reshtml)

cgi-bin/test_form_action.py

# coding=utf-8
from __future__ import unicode_literals, absolute_import

# Import modules for CGI handling
import cgi, cgitb
cgitb.enable()

if __name__ == '__main__':
    print("Content-Type: text/html")  # HTML is following
    print()

    form = cgi.FieldStorage()
    print(form)
    id = form.getvalue("id")
    name = form.getvalue("name")

    print(id)

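When I visit http://127.0.0.1:8000/cgi-bin/test.py, the Chinese characters "输入" are not displayed correctly; they show up as "��". I have to manually change the page's text encoding in Firefox from "Unicode" to "Chinese, Simplified" to make the Chinese characters display properly.

This is strange, because I put charset="utf-8" in cgi-bin/test.py.

In addition, when I type some Chinese into the form and submit it, the cgi-bin/test_form_action.py page comes up blank.

At the same time, the Windows terminal where I run SimpleHTTPServer shows these errors: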

127.0.0.1 - - [23/Mar/2018 23:43:32] b'Error in sys.excepthook:\r\nTraceback (most recent call last):\r\n  File "E:\Python\Python36\Lib\cgitb.py", line 268, in __call__\r\n    self.handle((etype, evalue, etb))\r\n  File "E:\Python\Python36\Lib\cgitb.py", line 288, in handle\r\n    self.file.write(doc + \'\n\')\r\nUnicodeEncodeError: \'gbk\' codec can\'t encode character \'\ufffd\' in position 1894: illegal multibyte sequence\r\n\r\nOriginal exception was:\r\nTraceback (most recent call last):\r\n  File "G:\Python\Project\VideoHelper\cgi-bin\test_form_action.py", line 13, in <module>\r\n    print(form)\r\nUnicodeEncodeError: \'gbk\' codec can\'t encode character \'\ufffd\' in position 52: illegal multibyte sequence\r\n'
127.0.0.1 - - [23/Mar/2018 23:43:32] CGI script exit status 0x1

Answer 1

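When you call print(), Python converts the string to bytes, i.e. it encodes the string using a default codec. Which codec that is depends on the environment; in your case it appears to be GBK (judging from the error message).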
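
To check which codec print() ends up using on a given machine, the stream can be inspected directly. The following is only a minimal sketch and not part of the original scripts; the helper names `describe_stdout_encoding` and `can_encode` are made up for illustration. It also probes the kind of failure seen in the traceback by trying to encode U+FFFD with GBK.

```python
import locale
import sys


def describe_stdout_encoding():
    """Return the codec names Python reports for text output (hypothetical helper)."""
    return {
        "stdout": sys.stdout.encoding,                 # codec used by print()
        "locale": locale.getpreferredencoding(False),  # default codec for text-mode open()
    }


def can_encode(text, codec):
    """Report whether `text` can be represented in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(describe_stdout_encoding())
    print(can_encode("\u8f93\u5165", "gbk"))  # the two characters used in the form label
    print(can_encode("\ufffd", "gbk"))        # the replacement character from the traceback
    print(can_encode("\ufffd", "utf-8"))
```

If the reported stdout codec is not UTF-8 while the page declares charset="utf-8", the browser and the script disagree about the bytes on the wire, which would explain the garbled label.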

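In the HTML page returned by the CGI script, you declare the codec ("charset") as UTF-8. You could of course change that to GBK, but doing so would only solve your first problem (the display of test.py), not the second one (the encoding error in test_form_action.py). It is probably better to have Python send UTF-8-encoded data on STDOUT instead.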

One way to do this is to replace every occurrence of

print(x)

with

sys.stdout.buffer.write(x.encode('utf8'))
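
A complete CGI response written this way might look like the sketch below. The function name `write_utf8_response` and its `stream` parameter are assumptions added for illustration; in a real script the stream would simply be sys.stdout.buffer. Putting charset=utf-8 into the Content-Type header is an addition of this sketch, meant to keep the declared encoding consistent with the bytes actually sent.

```python
import io
import sys


def write_utf8_response(body, stream=None):
    """Write a minimal CGI response as UTF-8 bytes (illustrative helper)."""
    if stream is None:
        stream = sys.stdout.buffer  # binary layer underneath sys.stdout
    header = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    stream.write(header.encode("ascii"))  # header line followed by a blank line
    stream.write(body.encode("utf-8"))    # body bytes, independent of the console codec
    stream.flush()


if __name__ == "__main__":
    # Dry run against an in-memory buffer instead of the real stdout.
    demo = io.BytesIO()
    write_utf8_response("<p>\u8f93\u5165</p>\n", demo)
    print(len(demo.getvalue()), "bytes written")
```

Because the text never passes through the text layer of sys.stdout, the console's default codec no longer matters for the response.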

Alternatively, you can replace sys.stdout with a re-encoding wrapper and leave the print() calls unchanged:

sys.stdout = open(sys.stdout.buffer.fileno(), 'w', encoding='utf8')

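Note: neither of these two solutions works in Python 2.x (there you would have to omit the .buffer part). I mention this because your code contains from __future__ import statements, which serve no purpose in code that only ever runs under Python 3.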
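
The wrapper approach can also be written with io.TextIOWrapper, the class that open() returns for text mode. This is a sketch under assumptions: `utf8_text_stream` is a hypothetical helper name, and the in-memory buffer only stands in for sys.stdout.buffer so the result can be inspected.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    # newline="\n" disables newline translation on write.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


if __name__ == "__main__":
    raw = io.BytesIO()               # stands in for sys.stdout.buffer
    out = utf8_text_stream(raw)
    print("Content-Type: text/html; charset=utf-8", file=out)
    print(file=out)
    print("\u8f93\u5165", file=out)  # the two characters used in the form label
    out.flush()
    print(raw.getvalue())
```

In the CGI script itself this would presumably become `sys.stdout = utf8_text_stream(sys.stdout.buffer)`, placed before the first print(). Placing it before cgitb.enable() as well seems advisable, since cgitb appears to keep a reference to whatever sys.stdout is at that moment. On Python 3.7 and later, `sys.stdout.reconfigure(encoding='utf-8')` offers the same effect without replacing the stream, but it is not available in the 3.6.4 used in the question.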

SimpleHTTPServer in Python 3.6.4 cannot handle non-ASCII strings (Chinese, in my case)
