Question

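So, I want a program to do two things:

Read a word
Read its Greek translation

Then I build a new string in the format "word,translation" and write it to a file.

So the file test.txt should contain "Hello,Γεια", and if I run the program again, the next entry should go on the line below it.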

word=raw_input("Word:\n")  #The Word
translation=raw_input("Translation:\n").decode("utf-8") #The Translation in UTF-8
format=word+","+translation+"\n"
file=open("dict.txt","w")
file.write(format.encode("utf-8"))
file.close()

The error I get:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 0: invalid start byte

Edit: this is Python 2

Answer 1

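Although Python 2 supports unicode, its input is not automatically decoded to unicode for you. raw_input returns a byte string, and if anything other than ASCII is typed or piped in, you get the encoded bytes. The trick is to figure out what that encoding is, and that depends on whatever is feeding data into the program. If it is a terminal, sys.stdin.encoding should tell you which encoding to use. If the input is piped in from, say, a file, sys.stdin.encoding is None and you simply have to know what the encoding is.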
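To see why the guess matters, here is a minimal sketch that reproduces the error from the question. It assumes, purely for illustration, that the terminal was using the Greek DOS code page cp737, in which Γ is the single byte 0x82; the question does not say which encoding was actually in use. try_decode is a hypothetical helper, not part of any library.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# ASSUMPTION: the terminal sends Greek text encoded as cp737.
# u"\u0393\u03b5\u03b9\u03b1" is the word "Γεια".
raw_bytes = u"\u0393\u03b5\u03b9\u03b1".encode("cp737")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(raw_bytes, "utf-8")   # wrong guess: decoding fails
as_cp737 = try_decode(raw_bytes, "cp737")  # matches the source: decoding works

print(repr(as_utf8), repr(as_cp737))
```

Decoding with the encoding the bytes were actually produced in is the only step that changes; the rest of the program can then work with unicode text throughout.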

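A solution to your problem follows. Note that although your way of writing the file (encode, then write) works, the codecs module provides a file object that does the encoding for you.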

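import sys
import codecs

# sys.stdin.encoding is None when input is piped from a file, so fall back
# to a guess; a command-line parameter may be useful in that case
_stdin_encoding = sys.stdin.encoding or 'utf-8'

def unicode_input(prompt):
    # raw_input returns encoded bytes in Python 2; decode them to unicode
    return raw_input(prompt).decode(_stdin_encoding)

word = unicode_input("Word:\n")  # the word
translation = unicode_input("Translation:\n")  # the Greek translation
entry = word + "," + translation + "\n"
# pass an encoding so codecs encodes on write; "a" appends instead of overwriting
with codecs.open("dict.txt", "a", encoding="utf-8") as myfile:
    myfile.write(entry)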
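The question also asks for each new pair to land on the line below the previous one, which means opening the file in append mode rather than write mode. A minimal sketch of that idea follows, using io.open, which accepts an encoding argument in both Python 2 and Python 3. The names append_entry and read_entries are hypothetical helpers made up for this illustration, and the inputs are assumed to be already decoded to unicode.

```python
# -*- coding: utf-8 -*-
import io


def append_entry(path, word, translation):
    """Append one "word,translation" line to a UTF-8 text file."""
    # Mode "a" adds to the end of the file; mode "w" would truncate it first.
    with io.open(path, "a", encoding="utf-8") as f:
        f.write(u"{0},{1}\n".format(word, translation))


def read_entries(path):
    """Read the file back as a list of (word, translation) tuples."""
    with io.open(path, "r", encoding="utf-8") as f:
        # Split on the first comma only, so a translation may contain commas.
        return [tuple(line.rstrip(u"\n").split(u",", 1))
                for line in f if line.strip()]
```

Calling append_entry("dict.txt", u"Hello", u"Γεια") twice with different words should leave two lines in the file, one per call.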

Python: Reading UTF-8 from raw_input() & writing UTF-8 to a file
