Question

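I am trying to open a basic file.txt file that sits in the same CWD as my Python interpreter.

So I do a=open("file.txt","r")

Then I want to display its content (it holds only one test line, such as hello world).

So I do content=a.read()

Just so you know, when I type a, I get this output: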

a
<_io.TextIOWrapper name='fichier.txt' mode='r' encoding='UTF-8'>

Then I get an error I don't understand. Does anyone have an idea how to fix it?

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    contenu=a.read()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 15: invalid continuation byte

Answer 1

Your file is probably not encoded in UTF-8. Try:

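# chardet is a third-party package (pip install chardet).
from chardet import detect

# Read raw bytes so that no decoding is attempted yet.
with open("file.txt", "rb") as infile:
    raw = infile.read()

# detect() returns a dict; 'encoding' is its best guess and may be None.
encoding = detect(raw)['encoding']
print(encoding)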
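If installing chardet is not an option, a rough standard-library alternative is to try a short list of candidate encodings in order. This is only a sketch, not what chardet does: the helper name `read_with_fallback` and the candidate list are assumptions chosen for illustration, and a clean decode does not prove the guess is right (latin-1 in particular accepts any byte sequence, so it goes last).

```python
def read_with_fallback(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate list is a guess and should be adapted
    to the tools that may have produced the file.
    """
    # Read raw bytes so that decoding is fully under our control.
    with open(path, "rb") as infile:
        raw = infile.read()

    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding; try the next

    raise ValueError(f"none of {candidates} could decode {path!r}")
```

Calling `text, encoding = read_with_fallback("file.txt")` then gives both the decoded text and the encoding that worked, which you can check by eye for odd characters.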

Answer 2

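Your file is not encoded in UTF-8. The encoding is determined by the tool that created the file, so make sure you read it with that same encoding.

Here is an example: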

>>> s = 'Sébastien Chabrol'
>>> s.encode('utf8')             # é in UTF-8 is encoded as bytes C3 A9.
b'S\xc3\xa9bastien Chabrol'
>>> s.encode('cp1252')           # é in cp1252 is encoded as byte E9.
b'S\xe9bastien Chabrol'
>>> s.encode('utf8').decode('1252')  # decoding incorrectly can produce wrong characters...
'SÃ©bastien Chabrol'
>>> s.encode('cp1252').decode('utf8') # or just fail.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1: invalid continuation byte

If you are using Python 3, you can provide the encoding when opening the file:

a = open('file.txt','r',encoding='utf8')
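As a small self-contained sketch of the same idea, the snippet below writes a sample file and reads it back. It assumes, purely for illustration, that the file was saved as cp1252; the file name `sample.txt` and the temporary directory are made up for the demo. It also shows `errors="replace"`, which avoids the exception but loses the undecodable bytes.

```python
import os
import tempfile

# Create a small sample file saved as cp1252 (an assumption for illustration).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="cp1252") as f:
    f.write("Sébastien Chabrol\n")

# Reading with the matching encoding returns the original text.
with open(path, "r", encoding="cp1252") as f:
    content = f.read()
print(content)

# Reading as UTF-8 with errors="replace" does not raise;
# each undecodable byte becomes U+FFFD (the replacement character).
with open(path, "r", encoding="utf-8", errors="replace") as f:
    lossy = f.read()
print(lossy)
```

Using `with` also closes the file automatically, which the one-line `open(...)` form does not.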

On Python 2 or 3, you can also use this backward-compatible syntax:

import io
a = io.open('file.txt','r',encoding='utf8')

If you don't know the encoding, you can open the file in binary mode to see the raw bytes and at least make a guess:

# 'rb' returns bytes, so no decoding (and no UnicodeDecodeError) happens here.
with open('file.txt','rb') as a:
    print(a.read())
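To narrow the guess down, it can help to look only at the bytes near the position named in the error message (position 15 in the question). A possible sketch, using the `start` and `end` attributes that `UnicodeDecodeError` carries; the helper name `show_bad_bytes` and the `context` parameter are hypothetical:

```python
def show_bad_bytes(raw, encoding="utf-8", context=8):
    """Return (position, nearby_bytes) for the first decode failure, or None.

    Hypothetical helper for inspection only; it does not fix anything.
    """
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start / err.end delimit the offending bytes within raw.
        lo = max(err.start - context, 0)
        return err.start, raw[lo:err.end + context]
    return None  # the bytes decoded cleanly with this encoding
```

For example, `show_bad_bytes(open('file.txt','rb').read())` returns the failing offset together with a short slice of surrounding bytes, which is usually enough to recognize an accented letter from a single-byte encoding.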

Learn more about Python and encodings here: https://nedbatchelder.com/text/unipain.html

I have a problem understanding an error in Python
