Question

When I do the following in an IPython notebook

s='½'
s
print s
print [s]

I see

'\xc2\xbd'
½
['\xc2\xbd']

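What is happening here?
How do I print a list of Unicode strings? (i.e. I want to see ['½'])

Edit: So from the comments, the difference is that "print s" uses s.__str__, while "s" and "print [s]" use s.__repr__.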

Answer 1

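You can use the repr function to create a string containing a printable representation of your list, then decode that string with the string-escape codec, which turns the backslash escapes back into the raw bytes and returns a byte string. When you print that byte string, your terminal decodes the bytes using its own encoding (usually UTF-8) and shows the character: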

>>> print repr([s]).decode('string-escape')
['½']

Note, however, that in Python 3.x every str is Unicode, so you don't need this trick:

Python 3.4.3 (default, Oct 14 2015, 20:28:29) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> s='½'
>>> print ([s])
['½']

For more information about Python encodings, read https://docs.python.org/2.4/lib/standard-encodings.html

Answer 2

'\xc2\xbd' is the printable representation of a bytes object on Python 2, in which non-printable bytes (those for which isprint() is 0) are replaced by their hex escapes; for example, the 0xc2 byte is shown as \xc2.

What is happening here?

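The Python bytestring literal s = '½' stores the bytes as they are. Your editor and your console use compatible encodings, so there is no mojibake: the bytes map to the same glyph, ½ (Unicode code point U+00BD).

A bare s at the REPL calls sys.displayhook, which displays repr(s).

print s writes the bytes as they are.

print [s] prints the list (it calls str(your_list)). repr(item) is called for each list item.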
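The difference between showing an item directly and showing it inside a list can be reproduced with a short sketch. It assumes Python 3, where byte strings carry a b prefix, so the text it produces differs slightly from the Python 2 output quoted in the question; the variable names are only illustrative.

```python
# Sketch for Python 3: bytes objects play the role of Python 2 byte strings.
s = u"\u00bd".encode("utf-8")   # the two UTF-8 bytes behind the glyph U+00BD

item_repr = repr(s)             # what the REPL displays for a bare `s`
list_text = str([s])            # what print([s]) writes

print(item_repr)                # escaped form of the two bytes
print(list_text)                # the same escaped form, wrapped in brackets

# str() of a list is built from repr() of each item, which is why the
# hex escapes show up again as soon as the value sits inside a list.
same_form = (list_text == "[" + item_repr + "]")
print(same_form)
```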

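How do I print a list of Unicode strings? (i.e. I want to see ['½'])

Use Unicode to work with text:

(a) In particular, use Unicode string literals instead of bytestring literals: add from __future__ import unicode_literals or use the u'' prefix: s = u'½'

(b) Declare the character encoding of your source code by adding at the top: # -*- coding: utf-8 -*- (note: it affects only the source code; it is unrelated to the character encoding used at runtime)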
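To make the distinction between text and bytes concrete, here is a minimal sketch. It assumes Python 3, where the u'' prefix is accepted but optional, and the names text and raw are only illustrative.

```python
# A Unicode string holds characters; a byte string holds encoded bytes.
text = u"\u00bd"              # one character: U+00BD, the glyph for one half
raw = text.encode("utf-8")    # the same character encoded as UTF-8 bytes

text_length = len(text)       # counts characters
raw_length = len(raw)         # counts bytes

print(text_length, raw_length)

# Decoding with the matching codec gives the original text back.
round_trip = raw.decode("utf-8")
print(round_trip == text)
```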

To print a list of Unicode strings as text, serialize it to a string first, for example by joining the items with a separator.
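One way to do that serialization is str.join. The sketch below assumes Python 3 and uses a hypothetical your_list as sample data; the separator is an arbitrary choice.

```python
# Hypothetical sample data; in Python 3 every str is a Unicode string.
your_list = [u"\u00bd", u"\u00bc"]    # one half and one quarter

# Joining builds one text string from the items themselves,
# so no repr() quoting or escaping is involved.
line = u", ".join(your_list)

# repr() of the list, by contrast, adds brackets and quotes around each item.
list_repr = repr(your_list)

print(len(line), len(list_repr))
```

Calling print(line) then writes the text using the console's encoding.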

If you need to exchange the data with other programs, you can use the JSON format.
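A possible JSON round trip with the standard json module looks like this. It is a sketch assuming Python 3, with your_list as hypothetical sample data.

```python
import json

your_list = [u"\u00bd"]   # hypothetical sample data: a single one-half sign

# ensure_ascii=False keeps non-ASCII characters as they are;
# the default (True) writes them as \uXXXX escapes instead.
as_text = json.dumps(your_list, ensure_ascii=False)
as_ascii = json.dumps(your_list)

print(as_ascii)

# Either form parses back to the original list.
from_text = json.loads(as_text)
from_ascii = json.loads(as_ascii)
print(from_text == from_ascii == your_list)
```

The escaped form is safe for any channel that only carries ASCII, while the unescaped form is easier for people to read.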

Don't use .decode('string-escape'); fix your data format instead.
