Question

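I'm trying to submit lines from a .txt file to the Google Translate API and then write the results to a separate .txt file. Everything works, except that when I read the output file it's unicode, so I end up with characters like \xeda. I'd like to convert the results to UTF-8 before writing them to the file, but my attempts don't seem to have any effect. I get no errors, but I still get garbage characters. Here is my (relevant) code: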

read_array = []
write_array = []
write_file = 'write_file.txt'
read_file = open('metaphors1.txt','r')
s = codecs.open('write_file.txt', 'w', 'utf-8')

for line in read_file:
    #Reads sentences from the input file, converts them to a string with
    #all lowercase letters (to prevent garbage values), then puts the strings
    #in an array
    readstring = str(line)
    readstring = readstring.lower()
    read_array.append(readstring)

for item in read_array:
    #removes new line symbols to prevent translation errors then submits
    #sentences in the array to the translator, then writes the sentences
    #to a new array
    readitem = str(item)
    readitem = readitem.rstrip('\n')
    results1 = translator.translate(readitem)
    resultstring = str(results1)
    write_array.append(resultstring)

for item in write_array:
    #writes the results to an output file
    writeitem = str(item)
    writeitem = writeitem.encode('utf-8')
    s.write("%s\n" % writeitem)

s.close()

I'm sure whatever I'm doing wrong is simple and obvious, but I'm stumped. Any help would be appreciated. Thanks!

Answer 1

Check out http://docs.python.org/2/library/stdtypes.html#str.decode. If you don't care about errors, you can even tell it to ignore them.

line.decode('utf-8', 'ignore')
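As a rough illustration of how that call might fit the workflow in the question, here is a minimal sketch. It assumes the input file is UTF-8 encoded. The names `decode_line`, `fake_translate`, and `translate_file` are hypothetical and exist only for this example; `fake_translate` stands in for the real translator call, whose return type the question does not show. Note that `decode` applies to byte strings (`str` in Python 2, `bytes` in Python 3), and that a file opened with an encoding already encodes on write, so calling `.encode('utf-8')` again before writing is not needed.

```python
# -*- coding: utf-8 -*-
import io


def decode_line(raw_line):
    """Decode UTF-8 bytes to text, silently dropping undecodable bytes."""
    return raw_line.decode('utf-8', 'ignore')


def fake_translate(text):
    """Hypothetical stand-in for the translator call; returns text unchanged."""
    return text


def translate_file(in_path, out_path, translate=fake_translate):
    """Read UTF-8 lines, translate each one, and write the results as UTF-8."""
    # Binary mode yields byte strings, so the decode step stays explicit.
    with io.open(in_path, 'rb') as src:
        lines = [decode_line(raw).rstrip(u'\r\n').lower() for raw in src]

    # The file object encodes on write, so no manual .encode() is needed.
    with io.open(out_path, 'w', encoding='utf-8') as dst:
        for line in lines:
            dst.write(translate(line) + u'\n')
```

Calling `translate_file('metaphors1.txt', 'write_file.txt')` would then produce a UTF-8 output file. Keep in mind that 'ignore' discards any bytes that fail to decode, so it can hide a wrong guess about the input encoding.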

Trying to convert output strings to UTF-8 when writing to a .txt file in Python
