I need to write a function that counts all the words in a file and prints the average word length. (Punctuation must be removed.)
def average(fileName):
    infile = open(fileName, 'r')
    wordcount = {}
    for word in infile.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
Answer 0 (score: 0)
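Once the for loop has run, you already have the wordcount dictionary, and summing its values gives you the number of words. I think the next step is to count the letters in the text file.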
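from collections import Counter
from string import ascii_lowercase

# Count how often each ASCII letter occurs in the file, ignoring case.
with open('text.txt') as counting:
    print(Counter(letter for line in counting
                  for letter in line.lower()
                  if letter in ascii_lowercase))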
After that, you can get the average length you need by dividing the total number of letters by the number of words.
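Putting the two counts together might look like the sketch below. It assumes that only ASCII letters count toward a word's length and that a token with no letters at all is not a word. The helper name `average_from_counts` is hypothetical, and it takes a string rather than a file name so it is easy to try out.

```python
from collections import Counter
from string import ascii_lowercase


def average_from_counts(text):
    """Average word length: total ASCII letters divided by number of words."""
    # Count each ASCII letter, ignoring case, digits and punctuation.
    letters = Counter(ch for ch in text.lower() if ch in ascii_lowercase)
    total_letters = sum(letters.values())
    # A word is a whitespace-separated token containing at least one letter
    # (an assumption; tokens such as "--" are skipped).
    words = [w for w in text.split()
             if any(ch in ascii_lowercase for ch in w.lower())]
    if not words:
        return 0.0
    return total_letters / len(words)


print(average_from_counts("Hello, world! It is I."))  # prints 3.0
```

To use it on a file, pass in the result of reading that file, for example `average_from_counts(open('text.txt').read())`.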
Answer 1 (score: 0)
If I understand you correctly:
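import re

# Raw string so that \W is passed to the regex engine unchanged.
non_word_chars = re.compile(r'\W+')
nr_of_words = 0
total_length = 0
with open('test.txt') as f:
    # split() with no argument splits on any whitespace, including newlines.
    for word in f.read().split():
        word = non_word_chars.sub('', word)
        if word:  # skip tokens that consisted only of punctuation
            nr_of_words += 1
            total_length += len(word)
# Assumes the file contains at least one word; round() gives the nearest integer.
print(round(total_length / nr_of_words))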
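This is efficient in both time and memory, because it does not build a dictionary and then iterate over it a second time to compute the average.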
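One possible way to package this as the `average(fileName)` function from the question is sketched below. It reads the file line by line, so the whole text is never held in memory at once. Returning 0.0 for a file with no words, and returning the unrounded value as well as printing it, are assumptions rather than part of the answer above. Note that `\W` keeps digits and underscores, so they still count toward a word's length.

```python
import re

NON_WORD_CHARS = re.compile(r'\W+')


def average(fileName):
    """Print and return the average word length in a file, ignoring punctuation."""
    nr_of_words = 0
    total_length = 0
    with open(fileName) as f:
        for line in f:  # the file object yields one line at a time
            for word in line.split():
                word = NON_WORD_CHARS.sub('', word)
                if word:  # skip tokens that were pure punctuation
                    nr_of_words += 1
                    total_length += len(word)
    # Assumption: a file with no words yields 0.0 instead of raising an error.
    result = total_length / nr_of_words if nr_of_words else 0.0
    print(result)
    return result
```

Calling `average('test.txt')` prints the average and also returns it, so the value can be rounded or formatted by the caller.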