Question

I'm trying to get Mako to render some strings containing Unicode characters:

tempLook=TemplateLookup(..., default_filters=[], input_encoding='utf8',output_encoding='utf-8', encoding_errors='replace')
...
print sys.stdout.encoding
uname=cherrypy.session['userName']
print uname
kwargs['_toshow']=uname
...
return tempLook.get_template(page).render(**kwargs)

The relevant template file:

...${_toshow}...

The output is:

UTF-8
Deşghfkskhü
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1: ordinal not in range(128)

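I don't think there is anything wrong with the string itself, since I can print it just fine.

Although I have experimented a lot with the input_encoding, output_encoding and default_filters parameters, it always complains that it cannot decode or encode with the ascii codec.

So I decided to try the examples found in the documentation, and the following is the one that works "best":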

input_encoding='utf-8', output_encoding='utf-8'
#(note : it still raised an error without output_encoding, despite tutorial not implying it)

with

${u"voix m’a réveillé."}

results in

voix mâ�a rÃ©veillÃ©

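I simply don't understand why this doesn't work. The "magic encoding comment" doesn't work either. All files are encoded in UTF-8.

I have spent hours on this to no avail. What am I missing?

Update:

~~I now have a simpler question:~~

~~Now that all the variables are unicode, how can I get Mako to render unicode strings without applying anything to them? Passing an empty filter list or calling render_unicode() didn't help.~~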

Answer 1

Yes, UTF-8 != Unicode.

UTF-8 is one specific string encoding, just as ASCII and ISO 8859-1 are. Try this:

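For any input string, call inputstring.decode('utf-8') (or whatever encoding the input actually arrives in). For any output string, call outputstring.encode('utf-8') (or whatever output encoding you want). For everything internal, use unicode strings ('this is a normal string'.decode('utf-8') == u'this is a normal string').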
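The rule above can be sketched roughly as follows. This is only an illustration, and it assumes Python 3, where bytes and str play the roles of Python 2's str and unicode. The helper names read_input and write_output are hypothetical, and the sample bytes are the UTF-8 form of the name from the question.

```python
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.
# The helper names below are hypothetical, chosen only for illustration.

def read_input(raw, encoding="utf-8"):
    """Decode incoming bytes into text at the input boundary."""
    return raw.decode(encoding)

def write_output(text, encoding="utf-8"):
    """Encode text back into bytes at the output boundary."""
    return text.encode(encoding)

raw_name = b"De\xc5\x9fghfkskh\xc3\xbc"  # UTF-8 bytes for the name in the question
name = read_input(raw_name)              # text, used everywhere internally
greeting = "Hello, " + name              # text + text needs no codec at all
payload = write_output(greeting)         # bytes again, ready to be sent out

print(payload)
```

The point of the design is that codecs are named explicitly and only at the edges, so nothing in between ever has to guess an encoding.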

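'foo' is a string; u'foo' is a unicode string, which doesn't "have" an encoding (it can't be decoded). So any time Python wants to change the encoding of a normal string, it first tries to "decode" it and then to "encode" it. The default codec is "ascii", which fails more often than not :-)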
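To make that failure mode concrete, here is a small sketch, again assuming Python 3 (in Python 2 the ASCII decode happens implicitly when a byte string is mixed with a unicode string, whereas here it has to be spelled out). Decoding UTF-8 bytes as ASCII raises the same kind of UnicodeDecodeError as in the question, and decoding them with a wrong but permissive codec such as Latin-1 yields the kind of garbled text shown for the documentation example.

```python
raw = "réveillé".encode("utf-8")  # b'r\xc3\xa9veill\xc3\xa9'

# What Python 2 effectively does when it has to guess: decode as ASCII.
error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = str(exc)
    print(error)

# A wrong but permissive codec does not raise; it silently gives mojibake.
garbled = raw.decode("latin-1")   # "rÃ©veillÃ©"

# Decoding with the codec that was actually used recovers the text.
restored = raw.decode("utf-8")    # "réveillé"
```

Seeing "Ã©" where "é" was expected is therefore a hint that UTF-8 bytes were decoded (or displayed) as Latin-1 somewhere along the way.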

Python / Mako: How to get unicode strings/characters parsed correctly?
