Question

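I wrote an essay and want to count the words in it using Python. I pasted the essay into a text file and saved it. I wrote a program that iterates over the text file and counts the words, but it keeps giving me the following error: "UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 62: character maps to <undefined>"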

Here is the code:

def counter(file):
    with open(file) as word:
        count=0
        for i in word:
            words=i.split()
            count+=words
        print(count)

The file name is essay.txt.

It doesn't work. Even trying to open essay.txt in the shell fails. I tried the following:

infile = open('essay.txt')
word=infile.read()
print(word)

This doesn't work either. What should I do? Please help. Thanks.

Answer 1

Try

open('essay.txt', encoding='utf-8')

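It's possible that the wrong encoding is being picked by default. The 'charmap' in the error message suggests Python fell back to the platform's default encoding (on Windows, typically a code page such as cp1252) rather than UTF-8. If the file is not UTF-8, try latin1.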
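As a rough sketch of that advice, the helper below tries UTF-8 first and falls back to latin1. The name read_text and its encodings parameter are made up for illustration; they are not part of any library. Note that latin1 maps every byte to some character, so the fallback never raises, but it can produce wrong characters if the file is really in another encoding.

```python
def read_text(path, encodings=("utf-8", "latin1")):
    """Return the file's text, trying each encoding in order.

    Hypothetical helper: the name and default encoding list are
    assumptions for this example, not an established API.
    """
    last_error = None
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError as err:
            # This encoding could not decode the bytes; try the next one.
            last_error = err
    # Every encoding failed (cannot happen while latin1 is in the list).
    raise last_error

# Usage (assumes essay.txt exists in the current directory):
# print(read_text("essay.txt"))
```

A plausible cause of byte 0x9d in particular: a curly closing quote (U+201D) is encoded in UTF-8 as the bytes E2 80 9D, and 0x9D has no mapping in cp1252, which would explain why the default Windows codec rejects it.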

Answer 2

I tried to reproduce your problem but couldn't. I saved my essay.txt with UTF-8 encoding, so my setup may differ from yours. The code that worked for me is below.

def counter(file):
    with open(file) as word:
        count=0
        for i in word:
            words=i.split()
            count += len(words)
        print(count)
counter("essay.txt")

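I made one change. For each line i in the file, I believe you want len() to return the number of words on that line (words is a list, so it cannot be added to an integer directly). You can then add that line's word count to the running total for the document. This works for me on Python 3.3.0. Let me know if I misunderstood!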
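Combining the len() fix with an explicit encoding might look like the sketch below. The function name count_words and its encoding parameter are assumptions for illustration, and UTF-8 is only a guess at how essay.txt was saved. It returns the total instead of printing it so the result can be reused.

```python
def count_words(path, encoding="utf-8"):
    """Count whitespace-separated words in a text file.

    Hypothetical variant of counter(); the explicit encoding avoids
    relying on the platform default (assumed here to be the cause
    of the UnicodeDecodeError).
    """
    total = 0
    with open(path, encoding=encoding) as f:
        for line in f:
            # split() with no arguments splits on any run of whitespace.
            total += len(line.split())
    return total

# Usage (assumes essay.txt exists in the current directory):
# print(count_words("essay.txt"))
```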

Thanks.

Python trouble reading a file
