# -*- coding: utf-8 -*-
d = {}
with open('transl.txt', 'r') as f:
    for line in f:
        (key, val) = line.split(' = ')
        d[key] = val
print d
Here are the contents of transl.txt (the file is encoded as ANSI):
send = button
addr = аддрес
When I run the program, I get this output:
{'addr': '\xe0\xe4\xe4\xf0\xe5\xf1', 'send': 'button\n'}
Answer 0: (score: 0)
# -*- coding: utf-8 -*-
d = {}
with open('transl.txt', 'r') as f:
    for line in f:
        (key, val) = line.split(' = ')
        d[key] = val.decode("windows-1251")
        # Now the values contain unicode strings.
        # This may or may not be desired. If you need to convert them
        # back to byte sequences in a given encoding, use the
        # `.encode(<encoding-name>)` method of the unicode string.
print d
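For readers on Python 3, the same idea can be sketched by letting `open` decode the file while reading. This is a minimal sketch, not the answerer's code: the helper name `load_translations`, the temporary sample file, and the choice to strip the trailing newline from each value are assumptions made for illustration.

```python
import io
import os
import tempfile


def load_translations(path, encoding="cp1251"):
    """Read 'key = value' lines into a dict of text (unicode) strings.

    `load_translations` is a hypothetical helper, not part of the thread.
    """
    result = {}
    with io.open(path, "r", encoding=encoding) as f:
        for line in f:
            line = line.rstrip("\r\n")  # drop the newline seen in 'button\n'
            if not line:
                continue  # skip blank lines
            key, val = line.split(" = ", 1)
            result[key] = val
    return result


# Recreate the question's file as windows-1251 bytes in a temporary file.
sample = u"send = button\naddr = \u0430\u0434\u0434\u0440\u0435\u0441".encode("cp1251")
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(sample)

d = load_translations(tmp_path)
os.remove(tmp_path)
print(sorted(d))
```

Passing `1` as the second argument to `split` keeps values that themselves contain ` = ` intact.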
Answer 1: (score: 0)
You can use codecs.open() from the standard library.
import codecs
d = {}
with codecs.open('transl.txt', encoding='maccyrillic') as f:
    for line in f:
        (key, val) = line.split(u' = ')
        d[key] = val
# 'button' is a value, not a key; look up one of the keys instead
print d['addr']
You can find the list of standard codecs in the documentation.
Your input looks like it is maccyrillic.
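One caveat when guessing the encoding from this sample: the six bytes shown in the question decode to the same text under both `cp1251` (windows-1251) and `mac_cyrillic`, so the sample alone cannot tell the two apart. The sketch below, written for Python 3, checks this and shows that other letters (here я) are encoded differently by the two codecs. The variable names are illustrative only.

```python
# Bytes of the 'addr' value exactly as printed in the question.
raw = b"\xe0\xe4\xe4\xf0\xe5\xf1"

as_cp1251 = raw.decode("cp1251")
as_mac = raw.decode("mac_cyrillic")

# Both codecs map these particular bytes to the same Cyrillic letters.
same_text = as_cp1251 == as_mac

# The two code pages do not agree everywhere: the letter U+044F (ya)
# is stored as a different byte in each of them.
ya_cp1251 = u"\u044f".encode("cp1251")
ya_mac = u"\u044f".encode("mac_cyrillic")
ya_differs = ya_cp1251 != ya_mac

print(same_text, ya_differs)
```

If the file was saved as "ANSI" on a Windows machine with a Russian locale, windows-1251 is the more likely of the two, but that is an inference rather than something the sample proves.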
Answer 2: (score: -1)
Your terminal must be misconfigured; set the LANG and LC_ALL environment variables to en_US.UTF-8 or the Unicode equivalent for your language.
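To check what Python believes about the terminal before changing environment variables, a short sketch like the following can help (Python 3 syntax; the variable names are illustrative). Note also that in Python 2, printing a dict shows the repr() of each value, so escapes such as `\xe0` appear for byte strings regardless of the terminal settings.

```python
import locale
import sys

# Encoding Python uses when writing text to standard output.
# It can be None when stdout is not a regular text stream.
stdout_encoding = getattr(sys.stdout, "encoding", None)

# Encoding implied by the locale settings (LANG, LC_ALL, and so on).
preferred = locale.getpreferredencoding(False)

print("stdout encoding:", stdout_encoding)
print("locale preferred encoding:", preferred)
```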