I am trying to count the most common values in a CSV file and append the occurrence count next to each item in the file. For example:
CSV file:
* 8 Values in Column 1*
HelloWorld
HelloWorld
HelloSaturn
HelloMars
HelloPluto
HelloSaturn
HelloMoon
HelloMoon
Python code to count the most common values:
#Removed Code - Take each row in CSV and append to list#
#Create new list, count common occurrences out of 8 items
newList = []
counter = collections.Counter(newList)
d = counter.most_common(8)
print d
Printed output (the most common values in the CSV above have been counted; for example, 'HelloWorld' appears twice):
[('HelloWorld', 2), ('HelloMars', 1), ('HelloSaturn', 2), ('HelloPluto', 1), ('HelloMoon', 2)]
I am now trying to append/insert these counts into the CSV file next to each value, for example:
* 8 Values in Column 1* *Occurrence*
HelloWorld 2
HelloWorld 2
HelloSaturn 2
HelloMars 1
HelloPluto 1
HelloSaturn 2
HelloMoon 2
HelloMoon 2
How can I do this?
Answer 0 (score: 2)
You need to rewrite the CSV file using a csv.writer object.
The code would look something like this (completely untested):
import csv

list_of_rows = list()
with open(filename) as fin:
    reader = csv.reader(fin)
    for row in reader:
        list_of_rows.append(row)

# calculate frequency of occurrence
counter = ...

with open(filename, "w") as fout:
    writer = csv.writer(fout)
    for row in counter.most_common(8):
        # row is now (word, frequency)
        writer.writerow(row)
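Note that looping over counter.most_common(8) writes one line per distinct word, whereas the question asks for the count next to every original row. A minimal Python 3 sketch of that variant follows. The function name append_occurrences and its parameters are made up for illustration, and it assumes the file has no header row and that the values to count are in the first column:

```python
import csv
from collections import Counter


def append_occurrences(src_path, dst_path, column=0):
    """Copy src_path to dst_path, appending the occurrence count to each row.

    Hypothetical helper: assumes no header row and non-empty rows of interest.
    """
    # Read every row up front; the counts are only known after a full pass.
    with open(src_path, newline="") as fin:
        rows = [row for row in csv.reader(fin) if row]  # skip blank lines

    # Count how often each value appears in the chosen column.
    counts = Counter(row[column] for row in rows)

    # Write each original row followed by the count of its value.
    with open(dst_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        for row in rows:
            writer.writerow(row + [counts[row[column]]])
```

Because all rows are read into memory before the output file is opened, src_path and dst_path can be the same file if you want to overwrite it in place, as the code above does with filename.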
Answer 1 (score: 1)
import csv
# I fake here the opening and extracting from a CSV file
# to obtain a list of the words of the first column
ss = """HelloWorld
HelloWorld
HelloSaturn
HelloMars
HelloPluto
HelloSaturn
HelloMoon
HelloMoon"""
column = ss.splitlines()
# Now, the counting
from collections import Counter
c = Counter(column)
# Seeing the counter we got
print '\n'.join(c)
# Putting the result in a CSV file
with open('resu.csv','wb') as g:
    gw = csv.writer(g)
    gw.writerows([item,c[item]] for item in column)