Question

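I'm new to Python and I'm trying to do an exercise that opens a txt file and then reads its contents (probably simple for most people, but I'll admit I'm struggling a bit).

I opened the file and used .read() to read it. Then I went on to remove all punctuation from the text. Next, I created a for loop. Inside the loop I used .split() and added to a running total with an expression like words = words + len(characters), where words was set to 0 before the loop and characters is what gets split at the start of the loop. Long story short, the problem I have now is that instead of adding whole words to my counter, it adds every character. What can I do to fix this in my for loop?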

my_document = open("book.txt")
readTheDocument = my_document.read
comma = readTheDocument.replace(",", "")
period = comma.replace(".", "")
stripDocument = period.strip()

numberOfWords = 0 

for line in my_document:
    splitDocument = line.split()
    numberOfWords = numberOfWords + len(splitDocument)


print(numberOfWords)

Answer 1

A more Pythonic way is to use with, which also closes the file for you:

with open("book.txt") as infile:
    count = len(infile.read().split())

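Keep in mind that what .split() returns are not really words in the grammatical sense. It splits on whitespace only, so you get word-like fragments with punctuation still attached. If you want proper words, use the nltk module: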

import nltk
with open("book.txt") as infile:
    count = len(nltk.word_tokenize(infile.read()))
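
To see what "word-like fragments" means in practice, here is a small sketch that compares plain .split() with a simple regular-expression tokenizer from the standard library. The sample sentence and the pattern are assumptions chosen for illustration; this is not what nltk does internally, and nltk.word_tokenize itself requires the nltk package and its tokenizer data to be installed.

```python
import re

sample = "Hello, world - it's a well-known fact."

# split() only cuts on whitespace, so punctuation stays glued to the words
# and a lone dash is counted as a "word".
fragments = sample.split()

# Illustrative pattern (an assumption): runs of ASCII letters,
# optionally joined by an apostrophe or a hyphen.
words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", sample)

print(fragments)  # ['Hello,', 'world', '-', "it's", 'a', 'well-known', 'fact.']
print(words)      # ['Hello', 'world', "it's", 'a', 'well-known', 'fact']
print(len(fragments), len(words))  # 7 6
```

The two counts differ by one here because .split() treats the stand-alone dash as a token. Which behavior is right depends on how you want to define a word.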

Answer 2

Just open the file and split its contents to get the word count.

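# Open read-only; the with block closes the file automatically.
with open("path/to/file/name.txt", "r") as file:
    count = 0
    # split() with no arguments splits on any run of whitespace.
    for word in file.read().split():
        count = count + 1
print(count)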
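
As for the loop in the question, a likely cause of the per-character count is iterating over a string: a for loop over the text returned by .read() yields one character at a time, not one word. The sketch below keeps the question's approach (read, strip punctuation, split) but splits the whole string once. The helper name count_words and the use of string.punctuation are assumptions made for illustration.

```python
import string


def count_words(text):
    """Count whitespace-separated words after removing ASCII punctuation."""
    # A translation table with a deletion set removes every punctuation character.
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    # Split the whole string once; looping over `cleaned` directly
    # would yield single characters instead of words.
    return len(cleaned.split())


if __name__ == "__main__":
    sample = "It was the best of times, it was the worst of times."
    print(count_words(sample))  # 12
```

For a file, pass its contents in, for example count_words(infile.read()) inside a with open("book.txt") as infile: block. Note that deleting punctuation joins hyphenated words, so "well-known" is counted as one word.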
