Unicode symbols in an output file in Python 3.6.1

Time: 2017-04-28 13:05:01

Tags: python unicode

I need to log connection errors to log.txt. My Windows is in Russian. My code:

    # e is the requests.ConnectionError raised on Windows when the server is not available
    # Take the error, pull out the part I need and convert it to str
    e_warning = str(e.args[0].reason)
    # Find the text I need in the string with "re"
    e_lst = re.findall(r'>:\s(.+)', e_warning)
    # Build a string again from the list that "re" returns
    e_str = ''.join(e_lst)
    # Convert the string to bytes
    e_str_unicode = codecs.encode(e_str, 'utf-8')
    # Decode the bytes back to str
    e_str_utf = codecs.decode(e_str_unicode, encoding='utf-8')
    # Show the message in a warning window
    messagebox.showerror(title='Connection error', message=e_str)
    with codecs.open('log.txt', 'a', encoding='utf-8') as log:
        log.write(strftime(str("%H:%M:%S %Y-%m-%d") + str(e_str_unicode) + '\n'))

If I use "e_str_utf" in the last line, it gives me:

UnicodeEncodeError: 'locale' codec can't encode character '\u041f' in position 72: Illegal byte sequence

That makes sense: position 72 is the first Russian letter. If I use "e_str_unicode" in the last line, there is no error, but in the log file I see:

15:25:18 2017-04-28b'Failed to establish a new connection: [WinError 10060] \xd0\x9f\xd0\xbe\xd0\xbf\xd1\x8b\xd1\x82\xd0\xba\xd0\xb0 \xd1\x83\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbe\xd0\xb2\xd0\xb8\xd1\x82\xd1\x8c \xd1\x81\xd0\xbe\xd0\xb5\xd0\xb4\xd0\xb8\xd0\xbd\xd0\xb5\xd0\xbd\xd0\xb8\xd0\xb5 \xd0\xb1\xd1\x8b\xd0\xbb\xd0\xb0 \xd0\xb1\xd0\xb5\xd0\xb7\xd1\x83\xd1\x81\xd0\xbf\xd0\xb5\xd1\x88\xd0\xbd\xd0\xbe\xd0\xb9, \xd1\x82.\xd0\xba. \xd0\xbe\xd1\x82 \xd0\xb4\xd1\x80\xd1\x83\xd0\xb3\xd0\xbe\xd0\xb3\xd0\xbe \xd0\xba\xd0\xbe\xd0\xbc\xd0\xbf\xd1\x8c\xd1\x8e\xd1\x82\xd0\xb5\xd1\x80\xd0\xb0 \xd0\xb7\xd0\xb0 \xd1\x82\xd1\x80\xd0\xb5\xd0\xb1\xd1\x83\xd0\xb5\xd0\xbc\xd0\xbe\xd0\xb5 \xd0\xb2\xd1\x80\xd0\xb5\xd0\xbc\xd1\x8f \xd0\xbd\xd0\xb5 \xd0\xbf\xd0\xbe\xd0\xbb\xd1\x83\xd1\x87\xd0\xb5\xd0\xbd \xd0\xbd\xd1\x83\xd0\xb6\xd0\xbd\xd1\x8b\xd0\xb9 \xd0\xbe\xd1\x82\xd0\xba\xd0\xbb\xd0\xb8\xd0\xba, \xd0\xb8\xd0\xbb\xd0\xb8 \xd0\xb1\xd1\x8b\xd0\xbb\xd0\xbe \xd1\x80\xd0\xb0\xd0\xb7\xd0\xbe\xd1\x80\xd0\xb2\xd0\xb0\xd0\xbd\xd0\xbe \xd1\x83\xd0\xb6\xd0\xb5 \xd1\x83\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbe\xd0\xb2\xd0\xbb\xd0\xb5\xd0\xbd\xd0\xbd\xd0\xbe\xd0\xb5 \xd1\x81\xd0\xbe\xd0\xb5\xd0\xb4\xd0\xb8\xd0\xbd\xd0\xb5\xd0\xbd\xd0\xb8\xd0\xb5 \xd0\xb8\xd0\xb7-\xd0\xb7\xd0\xb0 \xd0\xbd\xd0\xb5\xd0\xb2\xd0\xb5\xd1\x80\xd0\xbd\xd0\xbe\xd0\xb3\xd0\xbe \xd0\xbe\xd1\x82\xd0\xba\xd0\xbb\xd0\xb8\xd0\xba\xd0\xb0 \xd1\x83\xd0\xb6\xd0\xb5 \xd0\xbf\xd0\xbe\xd0\xb4\xd0\xba\xd0\xbb\xd1\x8e\xd1\x87\xd0\xb5\xd0\xbd\xd0\xbd\xd0\xbe\xd0\xb3\xd0\xbe \xd0\xba\xd0\xbe\xd0\xbc\xd0\xbf\xd1\x8c\xd1\x8e\xd1\x82\xd0\xb5\xd1\x80\xd0\xb0'

As I understand it, encoding='utf-8' in

    with codecs.open('log.txt', 'a', encoding='utf-8') as log:

should save the Unicode text to my file as UTF-8 bytes, but for some reason it ignores the encoding setting... Why?

1 answer:

Answer 0 (score: 0):

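First: why use codecs at all in codecs.open('log.txt', 'a', encoding='utf-8')? In Python 3 the built-in open() accepts the same encoding argument.

Second: this is wrong: strftime(str("%H:%M:%S %Y-%m-%d") + str(e_str_unicode) + '\n'). It passes the whole message to strftime as the format string, and str() of a bytes object gives its b'...' representation rather than the text. It should be strftime("%H:%M:%S %Y-%m-%d") + e_str + '\n', where e_str is the str value (the bytes object e_str_unicode cannot be concatenated with a str).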
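As a small illustration of why the log file contains b'...' with \x escapes, the sketch below encodes one Russian word taken from the error message and compares str() on the bytes with decoding them. The variable names are only for this example and are not from the question.

```python
# The first Russian word of the WinError 10060 message from the question.
text = "Попытка"

# Encoding produces a bytes object, not text.
encoded = text.encode("utf-8")

# str() on bytes returns the printable representation of the object,
# including the leading b and the quotes; this is what ended up in the log.
as_repr = str(encoded)

# Decoding with the same codec gives the original text back.
decoded = encoded.decode("utf-8")

print(as_repr)
print(decoded)
```

Writing the decoded str (or simply never encoding it) to a file opened with encoding='utf-8' stores the Cyrillic letters themselves.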

Here is a simple example:

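    from time import strftime

    # Read some text (it may contain non-ASCII characters) and echo it
    text = input()
    print(text)

    # The built-in open() handles the encoding; no codecs module needed
    with open('log.txt', 'a', encoding='utf-8') as log:
        # strftime formats only the timestamp; the text is appended as a str
        message = strftime("%H:%M:%S %Y-%m-%d") + '=>' + text + '\n'
        log.write(message)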
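Applied to the code from the question, the same idea might look like the sketch below. The helper names format_log_line and append_to_log are hypothetical, and the sample string only imitates the shape of the text that str(e.args[0].reason) seems to produce; it is an assumption, not captured output.

```python
import re
from time import strftime


def format_log_line(reason_text):
    """Build one log line: timestamp from strftime, message appended as str."""
    # Same pattern as in the question: keep the text after ">: ".
    found = re.findall(r'>:\s(.+)', reason_text)
    # Fall back to the full text if the pattern does not match.
    message = ''.join(found) or reason_text
    # Only the format string goes through strftime; the message stays a str.
    return strftime("%H:%M:%S %Y-%m-%d") + ' ' + message + '\n'


def append_to_log(path, reason_text):
    """Append the formatted line to the log file, encoded as UTF-8."""
    with open(path, 'a', encoding='utf-8') as log:
        log.write(format_log_line(reason_text))


if __name__ == '__main__':
    # Hypothetical sample text, shaped like the error in the question.
    sample = ("<connection object at 0x0>: Failed to establish a new "
              "connection: [WinError 10060] Попытка")
    append_to_log('log.txt', sample)
```

In the except block of the question this would be called as append_to_log('log.txt', str(e.args[0].reason)), with no encode or decode step in between.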