Question

Using mechanize, I retrieved the source of a web page that contains some non-ASCII characters, such as Chinese characters.

The code is as follows:

# using Python 2.6
from mechanize import Browser

br = Browser()
br.open("http://www.example.html")

src = br.response().read()  # retrieve the source of the page

print src  # print the source

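Questions:

1. From the page source I can see that its charset=gb2312, but when I print src, everything displays correctly; I mean there is no gibberish. Why? Does print know the encoding of src?

2. Should I explicitly decode or encode src?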

Answer 1

src is unicode, which has no encoding. print (or, more precisely, sys.stdout.write()) determines the encoding to use on output.
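
For question 2, here is a minimal sketch of what explicit decoding and encoding could look like. It is not code from the answer above. The helper names to_text and to_output_bytes are hypothetical, the sample string stands in for br.response().read(), and the gb2312 default is taken from the charset mentioned in the question. It is worth checking type(src) first, because to_text only decodes when it is handed a byte string.

```python
import sys

def to_text(raw, charset="gb2312"):
    # Hypothetical helper: decode a byte string into unicode text.
    # Input that is already unicode is returned unchanged.
    if isinstance(raw, bytes):
        return raw.decode(charset, "replace")
    return raw

def to_output_bytes(text, stream=None):
    # Hypothetical helper: encode text using the stream's declared encoding,
    # falling back to UTF-8 when the stream does not report one.
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    return text.encode(encoding, "replace")

sample = u"\u4e2d\u6587"        # two Chinese characters, used as test data
raw = sample.encode("gb2312")   # stand-in for br.response().read()
text = to_text(raw)             # explicit decode: bytes -> unicode
encoded = to_output_bytes(text) # explicit encode: unicode -> bytes for output
```

Decoding once at the input boundary and encoding once at the output boundary keeps everything in between as unicode, so the result does not depend on the terminal happening to use the same charset as the page.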
