How do I make non-English letters readable in Python?

Asked: 2016-05-14 11:00:26

Tags: python python-2.7 character-encoding

# -*- coding: utf-8 -*-
d = {}
with open('transl.txt', 'r') as f:
    for line in f:  
        (key, val) = line.split(' = ')
        d[key] = val

print d

Here is the content of the transl.txt file (its encoding is ANSI):

send = button
addr = аддрес

When I run the program, I get this output:

{'addr': '\xe0\xe4\xe4\xf0\xe5\xf1', 'send': 'button\n'}

3 Answers:

Answer 0: (score: 0)

# -*- coding: utf-8 -*-
d = {}
with open('transl.txt', 'r') as f:
    for line in f:  
        (key, val) = line.split(' = ')
        d[key] = val.decode("windows-1251")
# now the values contain unicode strings.
# this may or may not be desired. If you need to convert them
# back to byte sequences in a given encoding use `.encode(<encoding-name>)`
# method of unicode string
print d
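
The escapes in the question's output come from printing the dict, which shows the repr of each byte string; the bytes themselves decode cleanly. A minimal sketch of that, assuming the file really is Windows-1251 (what Windows usually calls "ANSI" on a Russian-locale system). The helper name `parse_line` is hypothetical and not part of the original code:

```python
# Runs on Python 2.7 and Python 3: the u'' and b'' prefixes are valid in both.

# The bytes that appear in the question's output for the key 'addr'.
raw = b'\xe0\xe4\xe4\xf0\xe5\xf1'

# Assumption: transl.txt was saved as Windows-1251 (cp1251).
text = raw.decode('cp1251')

def parse_line(raw_line, encoding='cp1251'):
    """Hypothetical helper: decode one 'key = value' line and split it."""
    # Strip the line ending so values do not carry a trailing '\n'.
    line = raw_line.decode(encoding).rstrip(u'\r\n')
    key, val = line.split(u' = ', 1)
    return key, val

lines = [b'send = button\n', b'addr = ' + raw]
d = dict(parse_line(chunk) for chunk in lines)

# On Python 2, repr() escapes non-ASCII characters, and that is what printing
# a dict shows. The decoded string still holds six Cyrillic letters.
print(len(text))
print(d[u'send'])
```

Printing a single value, such as `d[u'addr']`, rather than the whole dict should show the Cyrillic word itself, provided the terminal can display it.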

Answer 1: (score: 0)

You can use codecs.open() from the standard library:

import codecs
d = {}
with codecs.open('transl.txt', encoding='maccyrillic') as f:
    for line in f:  
        (key, val) = line.split(u' = ')
        d[key] = val
print d[u'addr']

You can find the list of standard codecs in the documentation. Your input looks like maccyrillic.
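
The sample bytes alone do not settle which codec is right, since the lowercase Cyrillic letters in the question sit at the same byte values in both `cp1251` and `mac_cyrillic`. A minimal sketch that checks this and then reads the file with `io.open` (available since Python 2.6, and the same function as the built-in `open` in Python 3). The function name `load_translations`, the default of `cp1251`, and the temporary sample file are assumptions made for illustration:

```python
import io
import os
import tempfile

# Bytes from the question's output.
RAW = b'\xe0\xe4\xe4\xf0\xe5\xf1'

# Both codecs give the same text for these lowercase letters...
same_for_sample = RAW.decode('cp1251') == RAW.decode('mac_cyrillic')
# ...but they disagree elsewhere, for example on byte 0xC0.
differs_for_capital = b'\xc0'.decode('cp1251') != b'\xc0'.decode('mac_cyrillic')

def load_translations(path, encoding='cp1251'):
    """Read 'key = value' lines into a dict of unicode strings."""
    result = {}
    with io.open(path, 'r', encoding=encoding) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            key, val = line.split(u' = ', 1)
            result[key] = val
    return result

# Build a stand-in for transl.txt so the sketch is self-contained.
sample = u'send = button\naddr = \u0430\u0434\u0434\u0440\u0435\u0441'
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as f:
    f.write(sample.encode('cp1251'))

translations = load_translations(path)
os.remove(path)
```

If the real file contains capital Cyrillic letters, decoding a line with each codec and comparing the results is one way to tell which encoding was actually used.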

Answer 2: (score: -1)

Your terminal must be configured incorrectly; set the LANG and LC_ALL environment variables to en_US.UTF-8 or the Unicode equivalent for your language.
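
Whether decoded text displays correctly also depends on the encoding Python believes the terminal uses. A minimal sketch for inspecting it, assuming nothing beyond the standard library; the helper name `describe_output_encoding` is hypothetical:

```python
import locale
import sys

def describe_output_encoding():
    """Return the encodings Python reports for stdout and for the locale."""
    return {
        # May be None when output is piped or redirected (notably on Python 2).
        'stdout': getattr(sys.stdout, 'encoding', None),
        # Derived from the locale settings, e.g. LANG and LC_ALL on Unix.
        'preferred': locale.getpreferredencoding(),
    }

info = describe_output_encoding()
print(info)
```

If either value is an ASCII-only encoding, printing Cyrillic text is likely to fail or be garbled, which is the situation the LANG and LC_ALL settings are meant to address.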