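I have a script that opens and reads a text file, splits it into individual words, and puts those words in a list. I use Counter to count how many times each word occurs in the list. Then I want to export the result to a .csv file, one row per word, like this:
the word hello appears 10 times
the word room appears 5 times
the word tree appears 3 times
... and so on
Could you tell me what I need to change here to make the script work?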
from collections import Counter
import re
import csv

cnt = Counter()
writefile = open('test1.csv', 'wb')
writer = csv.writer(writefile)

with open('screenplay.txt') as file: #Open .txt file with text
    text = file.read().lower()
    file.close()
    text = re.sub('[^a-z\ \']+', " ", text)
    words = list(text.split()) #Making list of each word
    for word in words:
        cnt[word] += 1 #Counting how many times word appear
        for key, count in cnt.iteritems():
            key = text
            writer.writerow([cnt[word]])
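Answer 0 (score: 1)
The biggest problem is that your second for loop runs for every occurrence of every word, rather than once per distinct word. You need to de-dent that whole loop so that it runs only after the counting has finished. Try something like this: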
from collections import Counter
import re
import csv

cnt = Counter()
writefile = open('test1.csv', 'wb')
writer = csv.writer(writefile)

with open('screenplay.txt') as file:
    text = file.read().lower()
    text = re.sub('[^a-z\ \']+', " ", text)
    words = list(text.split())
    for word in words:
        cnt[word] += 1
    for key, count in cnt.iteritems(): #De-dent this block
        writer.writerow([key,count]) #Output both the key and the count
writefile.close() #Make sure to close your file to guarantee it gets flushed
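
The code above is written for Python 2. On Python 3, `Counter.iteritems()` no longer exists and the csv module expects a text-mode file opened with `newline=''` rather than `'wb'`. Here is a minimal sketch of the same approach for Python 3. The helper names `count_words` and `write_counts` are made up for illustration, and UTF-8 is only an assumption about the file encoding.

```python
from collections import Counter
import csv
import re


def count_words(text):
    """Lowercase the text, keep only a-z, spaces and apostrophes, then count words."""
    cleaned = re.sub(r"[^a-z ']+", " ", text.lower())
    # Counter accepts any iterable, so no explicit counting loop is needed
    return Counter(cleaned.split())


def write_counts(counts, csv_path):
    """Write one (word, count) row per distinct word, most frequent first."""
    # newline='' is what the csv module asks for with text-mode files in Python 3
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerows(counts.most_common())


def main(txt_path, csv_path):
    """Read the text file, count its words, and export the counts."""
    # Encoding is an assumption; adjust it to match the actual file
    with open(txt_path, encoding="utf-8") as source:
        counts = count_words(source.read())
    write_counts(counts, csv_path)
```

Calling `main('screenplay.txt', 'test1.csv')` would then produce the CSV. Because both files are opened in `with` blocks, they are closed and flushed automatically, so no explicit `close()` call is needed.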