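This code counts the words in a file and it works, but the problem is that result.csv contains only the last result instead of all of them.
What should the fixed code look like?
Thanks.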
import re
import string

frequency = {}
out_filename = "result.csv"
headers = "word,frequency\n"

document_text = open('joined.xml', 'r')
text_string = document_text.read().lower()
match_pattern = re.findall(r'\b[a-z]{3,15}\b', text_string)

for word in match_pattern:
    count = frequency.get(word, 0)
    frequency[word] = count + 1

frequency_list = frequency.keys()

for words in frequency_list:
    print(words, frequency[words])
    with open(out_filename, "w") as fw:
        fw.write(headers)
        fw.write(words + ", " + str(frequency[words]) + "\n")
Answer 0 (score: 1)
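You should iterate over all of the word-frequency pairs and write each pair on its own line. Open the file once, write the header once, and put the loop inside the with block so that every pair is written to the file: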
with open(out_filename, "w") as fw:
    fw.write(headers)
    for word, freq in frequency.items():
        fw.write(word + ", " + str(freq) + "\n")
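As an alternative, the same idea can be sketched with the standard library's collections.Counter and csv modules, which take care of the counting and the row formatting. This is only a sketch under assumptions: the function names count_words and write_counts are hypothetical and not part of the original code, and sorting the rows by frequency is an extra that the question did not ask for.

```python
import csv
import re
from collections import Counter


def count_words(text):
    """Count lowercase words of 3 to 15 letters (same pattern as the question)."""
    return Counter(re.findall(r'\b[a-z]{3,15}\b', text.lower()))


def write_counts(counts, out_filename):
    """Write a header row, then one word,frequency row per word."""
    # The file is opened once, so every row ends up in the same file.
    # newline="" is what the csv module documentation recommends for output files.
    with open(out_filename, "w", newline="") as fw:
        writer = csv.writer(fw)
        writer.writerow(["word", "frequency"])
        # most_common() yields (word, count) pairs, most frequent first.
        writer.writerows(counts.most_common())
```

With the file names from the question, this could be used by reading joined.xml into a string, passing it to count_words, and passing the result together with "result.csv" to write_counts.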