Question

try:
    data=open('info.txt')
    for each_line in data:
        try:
            (role,line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
        except ValueError:
            print(each_line)
    data.close()
except IOError:
    print("File is missing")

When I print the file line by line, the output starts with three unwanted characters: "ï»¿".

Actual output:

ï»¿Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Expected output:

Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Answer 1

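I couldn't find a duplicate of this question for Python 3, which handles encodings differently from Python 2. So here is the answer: instead of opening the file with the default encoding (which depends on your platform and locale, and is often not UTF-8 on Windows), use 'utf-8-sig', which expects and strips the UTF-8 Byte Order Mark. The BOM is what shows up as "ï»¿".

That is, instead of

data = open('info.txt')

do

data = open('info.txt', encoding='utf-8-sig')

Note that if you are on Python 2, you should see "Python, Encoding output to UTF-8" and "Convert UTF-8 with BOM to UTF-8 with no BOM in Python". You will need to do some shenanigans with codecs or str.decode to get this working in Python 2. In Python 3, all you need to do is set the encoding= parameter when you open the file.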
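To see where the three characters come from and what 'utf-8-sig' changes, here is a minimal sketch. It writes a small temporary file that begins with the UTF-8 BOM (the file contents and temporary path are made up for illustration) and reads the first line back with three encodings. The cp1252 case is an assumption about the asker's default encoding, chosen because that is a code page under which the BOM bytes display as "ï»¿".

```python
import codecs
import os
import tempfile

# Hypothetical sample file: a UTF-8 BOM followed by one line of dialogue.
path = os.path.join(tempfile.mkdtemp(), "info.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"Man: Is this the right room for an argument?\n")

def first_line(encoding):
    """Return the first line of the sample file decoded with `encoding`."""
    with open(path, encoding=encoding) as f:
        return f.readline()

as_cp1252 = first_line("cp1252")       # BOM bytes EF BB BF decode to three characters
as_utf8 = first_line("utf-8")          # BOM survives as the single character U+FEFF
as_utf8_sig = first_line("utf-8-sig")  # BOM is recognized and dropped

print(repr(as_cp1252[:3]))
print(repr(as_utf8[:1]))
print(repr(as_utf8_sig[:3]))
```

Passing the encoding explicitly also makes the script behave the same way regardless of the machine's locale.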

Answer 2

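I ran into a similar problem when working with Excel CSV files. Initially I had saved the file with the "CSV UTF-8 (Comma delimited)" option from the drop-down. I then saved it as a plain "CSV (Comma delimited)" file and everything worked. Perhaps something similar applies to .txt files.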
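If re-saving the file is not an option, the same 'utf-8-sig' approach should work for CSV files too. The sketch below is only an illustration: it fakes a BOM-prefixed export by writing the bytes itself (the file name and rows are invented), then reads it with the standard csv module.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical stand-in for a BOM-prefixed CSV export: BOM + two rows.
csv_path = os.path.join(tempfile.mkdtemp(), "export.csv")
with open(csv_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"name,line\r\nMan,Hello\r\n")

# utf-8-sig drops the BOM; newline="" is what the csv module docs recommend.
with open(csv_path, encoding="utf-8-sig", newline="") as f:
    rows = list(csv.reader(f))

print(rows[0])
```

With a plain 'utf-8' decode, the first header cell would instead carry the BOM character in front of "name", which is easy to miss when looking up columns by name.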

Answer 3

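When this happens, it only affects the first line of the CSV, both when reading and when writing. For what I'm doing, I just put a "sacrificial" entry in the first position, so those characters get attached to my sacrificial entry rather than to anything I care about. Definitely not a robust solution, but it is quick and works for my purposes.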
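As a possible alternative to a sacrificial entry, the stray character can be removed from the first line after reading. This is a rough sketch that assumes the text was decoded as UTF-8, so the BOM arrives as the single character U+FEFF. The helper name strip_bom and the sample lines are hypothetical.

```python
def strip_bom(text):
    """Remove one leading U+FEFF (the decoded BOM) if present."""
    return text[1:] if text.startswith("\ufeff") else text

# Hypothetical lines as they would look after a plain UTF-8 decode.
lines = [
    "\ufeffMan: Is this the right room for an argument?\n",
    "Other Man: I've told you once.\n",
]

# Only the first line can carry the BOM, so only it needs cleaning.
cleaned = [strip_bom(lines[0])] + lines[1:]
print(repr(cleaned[0][:4]))
```

If the file was decoded with a different encoding, the BOM shows up as other characters and this check will not match, so opening the file with 'utf-8-sig' is usually the simpler route.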

Why does my Python code print the extra characters "ï»¿" when reading from a text file?
