I am trying to print the contents of a file. The command 'file -bi filename' reports 'text/plain; charset=iso-8859-1'. The file contains strings like 'ÏÂÔØ¡¢°²×°¡¢¸´ÖÆ¡¢·ÃÎÊ', which are supposed to be Chinese characters. I tried the following in the Python shell:
string='ÏÂÔØ¡¢°²×°¡¢¸´ÖÆ¡¢·ÃÎÊ¡¢µ¥»÷¡°½ÓÊÜ¡±°´Å¥£¬»òÒÔÆäËû·½Ê½Ê¹ÓóÌÐò'
a= string.decode('iso-8859-1')
b=a.encode('utf-8')
print b
and
print( string.decode('iso-8859-1').encode('utf-8'))
and
source_encoding = "iso-8859-1"
string = string.encode(source_encoding)
string = unicode(string, 'utf-8')
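But I cannot see the Chinese characters on screen; instead I see "à à ⠡¢A·ÃÃᢥÂμûA·Â¡Â½ÃÃá±A°Ã'Â壬»²ð ÃÃäÃûA·Â½Ã½Ã¹ÃóÃÃò". I used an online tool, http://www.mdbg.net/chindict/chindict.php, selected iso-8859-1 as the current encoding and GB18030 as the original encoding, and the result came out as 国际程序许可协议......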
Can anyone suggest Python commands that will display these strings in Chinese? Thanks in advance.
Answer 0 (score: 1)
As the online tool suggests, file is probably wrong, since it only "guesses" the encoding. Using gb18030 as the encoding gives the correct result:
>>> s = 'ÏÂÔØ¡¢°²×°¡¢¸´ÖÆ¡¢·ÃÎÊ¡¢µ¥»÷¡°½ÓÊÜ¡±°´Å¥£¬»òÒÔÆäËû·½Ê½Ê¹ÓóÌÐò'
>>> print s.decode('gb18030')
下载、安装、复制、访问、单击“接受”按钮,或以其他方式使用程序
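The session above is Python 2, where a plain string holds raw bytes. In Python 3, str is already Unicode and has no decode method, so the same repair might be sketched as below. This is only a sketch: fix_mojibake and read_text are hypothetical helper names, not part of any library, the short sample text stands in for the file's contents, and it assumes the file really is GB18030, as the online tool suggested.

```python
import io
import os
import tempfile


def fix_mojibake(text, wrong="iso-8859-1", right="gb18030"):
    """Undo a wrong decode: recover the raw bytes, then decode them properly.

    Hypothetical helper for illustration only.
    """
    return text.encode(wrong).decode(right)


def read_text(path, encoding="gb18030"):
    """Read a whole file, decoding it with the given encoding.

    Hypothetical helper; the default encoding is an assumption about the file.
    """
    with io.open(path, encoding=encoding) as f:
        return f.read()


original = "下载、安装"
raw = original.encode("gb18030")      # the bytes as they sit in the file
garbled = raw.decode("iso-8859-1")    # what a Latin-1 reading of them shows

# Reversing the wrong decode gives the Chinese text back.
print(fix_mojibake(garbled) == original)

# Reading the file with the right encoding avoids the problem altogether.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw)
try:
    print(read_text(path) == original)
finally:
    os.remove(path)
```

Round-tripping through iso-8859-1 is lossless because that codec maps every byte value to exactly one code point, so the original bytes can always be recovered before decoding them as gb18030.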