I have an external CSV file, and I want to count how many times the same words appear together in a row. My CSV looks like this:
Timestamp,Destination,Source
2015-05-25,A,B
2015-05-25,A,B
2015-05-25,A,B
2015-05-25,C,D
2015-05-25,C,D
2015-05-25,E,F
2015-05-25,E,F
2015-05-25,E,F
2015-05-25,E,F
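In the CSV file above, A and B appear to communicate with each other 3 times, C and D 2 times, and E and F 4 times, so I want to write this information to a CSV file using Python.
It should look like this: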
Destination,Source,Counts
A,B,3
C,D,2
E,F,4
Answer 0 (score: 4)
Using collections.Counter, you can easily count the occurrences of words.
    import csv
    from collections import Counter

    with open('words.csv') as f:
        next(f)  # skip header
        # Count each (Destination, Source) pair taken from columns 1 and 2
        occurrence = Counter(tuple(row[1:3]) for row in csv.reader(f))

    # newline='' (Python 3) stops csv.writer from emitting blank rows on Windows
    with open('occurrence.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['Destination', 'Source', 'Counts'])
        for (dest, src), cnt in occurrence.items():
            writer.writerow([dest, src, cnt])
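To check the counting step without touching any files, here is a minimal sketch that feeds the sample rows from the question through the same Counter idea. The helper name `count_pairs` and the use of io.StringIO as a stand-in for the file are my own assumptions, not part of the answer above.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the question; io.StringIO stands in for words.csv.
SAMPLE = """Timestamp,Destination,Source
2015-05-25,A,B
2015-05-25,A,B
2015-05-25,A,B
2015-05-25,C,D
2015-05-25,C,D
2015-05-25,E,F
2015-05-25,E,F
2015-05-25,E,F
2015-05-25,E,F
"""


def count_pairs(lines):
    """Count (Destination, Source) pairs from an iterable of CSV lines.

    Hypothetical helper for illustration only.
    """
    reader = csv.reader(lines)
    next(reader)  # skip header
    return Counter((row[1], row[2]) for row in reader)


counts = count_pairs(io.StringIO(SAMPLE))

# most_common() lists pairs by descending count, if a sorted report is wanted.
for (dest, src), cnt in counts.most_common():
    print(dest, src, cnt)
```

Iterating with counts.items() instead keeps the pairs in the order they first appear in the file (Python 3.7+), which matches the layout asked for in the question.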
Answer 1 (score: 1)
You already have almost all of the logic from your previous question; you only need to write the items out:
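    import csv
    from collections import Counter
    from operator import itemgetter

    # Python 3 port: the built-in map() and dict.items() replace
    # itertools.imap() and iteritems() from the Python 2 original.
    with open('in.csv', newline='') as f, open('out.csv', 'w', newline='') as o:
        wr = csv.writer(o)
        next(f)  # skip header
        # Write the header through the csv writer so it ends with a line
        # terminator; a bare o.write() would fuse it with the first data row.
        wr.writerow(['Destination', 'Source', 'Counts'])
        wr.writerows([a, b, c] for (a, b), c in
                     Counter(map(itemgetter(1, 2), csv.reader(f))).items())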
out.csv:
Destination,Source,Counts
A,B,3
C,D,2
E,F,4
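If the column positions might change, one possible variant is to select the columns by header name with csv.DictReader instead of by index. This is only a sketch under that assumption: the function name `write_pair_counts` and its `columns` parameter are hypothetical, and io.StringIO objects stand in for in.csv and out.csv.

```python
import csv
import io
from collections import Counter


def write_pair_counts(src, dst, columns=("Destination", "Source")):
    """Count rows grouped by the named columns and write them as CSV.

    Hypothetical helper: src and dst are any text file-like objects.
    """
    reader = csv.DictReader(src)  # consumes the header row itself
    counts = Counter(tuple(row[c] for c in columns) for row in reader)

    writer = csv.writer(dst)
    writer.writerow([*columns, "Counts"])
    for key, cnt in counts.items():  # first-seen order on Python 3.7+
        writer.writerow([*key, cnt])
    return counts


# In-memory stand-ins for in.csv and out.csv, using rows from the question.
src = io.StringIO(
    "Timestamp,Destination,Source\n"
    "2015-05-25,A,B\n2015-05-25,A,B\n2015-05-25,A,B\n"
    "2015-05-25,C,D\n2015-05-25,C,D\n"
    "2015-05-25,E,F\n2015-05-25,E,F\n2015-05-25,E,F\n2015-05-25,E,F\n"
)
dst = io.StringIO()
write_pair_counts(src, dst)
print(dst.getvalue())
```

With real files, the same function could be called on handles opened with newline='' as the csv module documentation recommends.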