Question

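I'm on Linux and want to write strings (in UTF-8) to a txt file. I have tried many approaches, but I always get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36: ordinal not in range(128)

Is there a way to write only the ASCII characters to the file and ignore the non-ASCII ones? My code: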

# -*- coding: UTF-8-*-

import os
import sys


def __init__(self, dirname, speaker, file, exportFile):

  text_file = open(exportFile, "a")

  text_file.write(speaker.encode("utf-8"))
  text_file.write(file.encode("utf-8"))

  text_file.close()

Thanks.

Answer 1

Try using the codecs module.

# -*- coding: UTF-8-*-

import codecs


def __init__(self, dirname, speaker, file, exportFile):

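  # codecs.open encodes on write, so speaker and file must be unicode
  # objects here (decode UTF-8 byte strings first), not pre-encoded bytes.
  with codecs.open(exportFile, "a", 'utf-8') as text_file:
      text_file.write(speaker)
      text_file.write(file)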

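Also, note that the name of your file variable shadows the built-in file type.

Finally, I suggest reading http://www.joelonsoftware.com/articles/Unicode.html to better understand what Unicode is, together with the Unicode documentation for your Python version to understand how to use it in Python.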
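
To make the text/bytes distinction concrete, here is a minimal sketch, assuming Python 3 (where text is `str` and encoded data is `bytes`). The sample string is made up for illustration. It reproduces the same kind of failure as in the question: 0xc3 is the first byte of a UTF-8 encoded non-ASCII character, and the ascii codec cannot decode it.

```python
# Assumes Python 3. The sample value is hypothetical.
text = "caf\u00e9"                 # text: Unicode code points ("cafe" with an accented e)
data = text.encode("utf-8")        # bytes: the accented e becomes 0xc3 0xa9

# Decoding with the matching codec round-trips cleanly.
round_trip = data.decode("utf-8")

# Decoding the same bytes as ASCII fails on the first non-ASCII byte.
try:
    data.decode("ascii")
    failed = False
    bad_byte = None
except UnicodeDecodeError as exc:
    failed = True
    bad_byte = data[exc.start]     # the offending byte value
```

In Python 2 the same decode happens implicitly when `.encode()` is called on a byte string, which is why the traceback mentions the ascii codec even though the code only asks for UTF-8.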

Answer 2

You can use the codecs module:

import codecs
text_file = codecs.open(exportFile,mode='a',encoding='utf-8')
text_file.write(...)
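
As a variation on the same idea, here is a minimal sketch using `io.open`, which takes an `encoding` argument in both Python 2 and Python 3. The helper name `append_text`, the temporary path, and the sample strings are assumptions for illustration, not part of the question's code. The function expects text (unicode), not already-encoded bytes.

```python
import io
import os
import tempfile


def append_text(path, text):
    """Append text to path, letting the file object encode it as UTF-8."""
    # The file object does the encoding, so no .encode() call is needed here.
    with io.open(path, "a", encoding="utf-8") as text_file:
        text_file.write(text)


# Hypothetical usage with made-up values.
export_file = os.path.join(tempfile.mkdtemp(), "export.txt")
append_text(export_file, u"Se\u00f1or M\u00fcller\n")
append_text(export_file, u"recording_01.wav\n")
```

Using `with` also closes the file even if a write fails, which the bare `codecs.open` call does not do on its own.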

Answer 3

You can decode the input string before writing it:

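# Python 2: speaker is a UTF-8 byte string, so decode it to unicode first.
text = speaker.decode("utf8")
with open(exportFile, "a") as text_file:
    text_file.write(text.encode("utf-8"))
    # This only works if file is unicode or pure ASCII; otherwise decode it too.
    text_file.write(file.encode("utf-8"))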
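
If the goal really is to keep only the ASCII characters, as the question asks, one option is the `"ignore"` error handler of `str.encode`. The following is a minimal sketch, assuming Python 3 and text input (a UTF-8 byte string would need to be decoded first). The helper names `to_ascii` and `append_ascii`, the temporary path, and the sample string are hypothetical.

```python
import os
import tempfile


def to_ascii(text):
    """Return text with every non-ASCII character dropped."""
    # "ignore" silently skips characters the ascii codec cannot encode.
    return text.encode("ascii", "ignore").decode("ascii")


def append_ascii(path, text):
    """Append only the ASCII characters of text to path."""
    with open(path, "a", encoding="ascii") as text_file:
        text_file.write(to_ascii(text))


# Hypothetical usage with a made-up value.
export_file = os.path.join(tempfile.mkdtemp(), "export.txt")
append_ascii(export_file, "Se\u00f1or M\u00fcller\n")
```

Note that this loses information: accented letters are removed rather than replaced with a close ASCII equivalent, so writing UTF-8 is usually the better choice when the reader of the file can handle it.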

python - write to file (ignore non-ASCII characters)