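I am trying to compute the average word length after reading text from a file. However, the text in the file is not laid out with normal sentence structure: sometimes there is an extra space between words, and sometimes a line break falls in the middle of a sentence.
Current code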
    def average(filename):
        with open(filename, "r") as f:
            for line in f:
                words = line.split()
                average = sum(len(words) for words in words)/len(words)
                return average
>>>4.3076923076923075
Expected
>>>4.352941176470588
File
Here are some words there is no punctuation but there are words what
is the average length
Answer 0 (score: 2)
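When you open a file as f and run

    for x in f:

x will be each line of the file in turn, terminated by a newline. The answer you got is exactly right for the first line of text alone (13 words, 56 letters, 56/13 ≈ 4.3077). If you want the second line counted together with the first, you need to process the text file as a whole rather than line by line.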
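To see the per-line behaviour concretely, here is a small sketch. It uses io.StringIO as a stand-in for the opened file (an assumption, so the example runs without a file on disk), and the helper name per_line_averages is hypothetical.

```python
import io

# Stand-in for the opened file; the text is the sample from the question.
text = (
    "Here are some words there is no punctuation but there are words what\n"
    "is the average length\n"
)

def per_line_averages(f):
    """Return the average word length of each non-empty line, one entry per line."""
    result = []
    for x in f:  # x is one line, including its trailing "\n"
        words = x.split()
        if words:
            result.append(sum(len(word) for word in words) / len(words))
    return result

averages = per_line_averages(io.StringIO(text))
print(averages[0])  # average for the first line only (56 / 13)
print(averages[1])  # average for the second line only (18 / 4)
```

Neither per-line value equals the average over the whole file, which is why the words from all lines have to be pooled before dividing.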
Assuming you want the average over all the words in the file, the following should work a bit better:
    def average(filename):
        with open(filename, "r") as f:
            lines = [line for line in f]
            words = " ".join(lines).split()
            average = sum(len(word) for word in words)/len(words)
            return average
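A slightly shorter variant, offered as a sketch rather than the only way to do it: f.read() returns the whole file as one string, and split() with no arguments splits on any run of whitespace, so extra spaces and line breaks are handled in one step. The name average_word_length and the choice to return 0.0 for a file with no words are assumptions of this example.

```python
def average_word_length(filename):
    """Average length of all whitespace-separated words in the file."""
    with open(filename, "r") as f:
        # split() with no arguments treats spaces, tabs and newlines alike
        words = f.read().split()
    if not words:
        # Assumption: a file with no words yields 0.0 instead of raising ZeroDivisionError
        return 0.0
    return sum(len(word) for word in words) / len(words)
```

For the sample file in the question this pools all 17 words (74 letters in total), so the result is 74 / 17, which matches the expected value.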