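I need to open a text file and count how many times each name listed in another file occurs in it. The program should write the name; count pairs, separated by semicolons, to a .csv file.
It should look like this:
Jane; 77
Hector; 34
Anna; 39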
...
I tried using Counter, but the result looks like a list, so I think this is the wrong approach for the task:
import re
from collections import Counter

# names to look for
wanted = re.findall(r'\w+', open('iliadcounts.csv').read().lower())
cnt = Counter()
# every word of the text
words = re.findall(r'\w+', open('pg6130.txt').read().lower())
for word in words:
    if word in wanted:
        cnt[word] += 1
print(cnt)
But this is definitely not the right code for this task...
Answer 0 (score: 1)
You can feed the whole list of words to Counter at once and it will count them for you.
You can then print only the words in wanted by iterating over it:
import re
from collections import Counter

# create some demo data as I do not have your data at hand - uses your filenames
def create_demo_files():
    with open('iliadcounts.csv', "w") as f:
        f.write("hug,crane,box")
    with open('pg6130.txt', "w") as f:
        f.write("hug,shoe,blues,crane,crane,box,box,box,wood")

create_demo_files()

# work with your files
with open('iliadcounts.csv') as f:
    wanted = re.findall(r'\w+', f.read().lower())
with open('pg6130.txt') as f:
    cnt = Counter(re.findall(r'\w+', f.read().lower()))

# printed output for all words in wanted (all words are counted)
for word in wanted:
    # cnt[word] gives 0 for a word that never occurs; cnt.get(word) would give None
    print("{}; {}".format(word, cnt[word]))

# would work as well:
# https://docs.python.org/3/library/string.html#string-formatting
# print(f"{word}; {cnt[word]}")
Output:
hug; 1
crane; 2
box; 3
Or you can print the whole counter:
print(cnt)
Output:
Counter({'box': 3, 'crane': 2, 'hug': 1, 'shoe': 1, 'blues': 1, 'wood': 1})
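The question also asks for the name; count pairs to be written to a semicolon-separated .csv file rather than printed. Here is a minimal sketch of that step, assuming the standard csv module is acceptable. The helper name write_counts, the output filename name_counts.csv and the extra wanted word "tree" are made up for illustration; the rest of the demo data mirrors the example above.

```python
import csv
import re
from collections import Counter


def write_counts(wanted, cnt, out_path):
    """Write one 'name;count' row per wanted name (hypothetical helper)."""
    # newline='' lets the csv module control the line endings itself
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter=';')
        for word in wanted:
            # a Counter returns 0 for names that never occur in the text
            writer.writerow([word, cnt[word]])


# demo data mirroring the example above, so the sketch runs on its own;
# "tree" is added to show a name with no occurrences
wanted = re.findall(r'\w+', "hug,crane,box,tree")
cnt = Counter(re.findall(r'\w+', "hug,shoe,blues,crane,crane,box,box,box,wood"))

write_counts(wanted, cnt, 'name_counts.csv')  # assumed output filename
```

Note that csv.writer puts no space after the semicolon (hug;1 rather than hug; 1). If the space shown in the question is really required, writing each line with "{}; {}\n".format(word, cnt[word]) to the file would be an alternative.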
Links: