I am looking to parse a CSV file and aggregate two columns.
Data in the CSV file:
'IP Address', Severity
10.0.0.1, High
10.0.0.1, High
10.0.0.1, Low
10.0.0.1, Medium
10.0.0.2, Medium
10.0.0.2, High
10.0.0.2, Low
10.0.0.3, Medium
10.0.0.3, High
10.0.0.3, Medium
I would like to get output along the following lines:
'IP Address', Severity
10.0.0.1, High:2, Medium:1, Low:1
10.0.0.2, High:1, Medium:1, Low:1
10.0.0.3, High:1, Medium:2, Low:0
or (ideally)
'IP Address', High, Medium, Low
10.0.0.1, 2, 1, 1
10.0.0.2, 1, 1, 1
10.0.0.3, 1, 2, 0
The closest I have come is here: Parse CSV file and aggregate the values
I can't seem to aggregate the string (Severity) variable.
How can I output this data?
Any help is appreciated.
Answer 0 (score: 1)
import csv
from collections import defaultdict

with open('text.txt', newline='') as f, open('ofile.csv', 'w', newline='') as g:
    reader, writer = csv.reader(f), csv.writer(g)
    results = defaultdict(list)
    next(reader)  # skip header line
    for ip, severity in reader:
        results[ip].append(severity)  # severity keeps its leading space, e.g. " High"
    writer.writerow(["'IP Address'", " High", " Medium", " Low"])  # write headers
    for ip, severities in sorted(results.items()):
        writer.writerow([ip] + [severities.count(t) for t in [" High", " Medium", " Low"]])
Produces:
'IP Address', High, Medium, Low
10.0.0.1,2,1,1
10.0.0.2,1,1,1
10.0.0.3,1,2,0
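One thing to be aware of with the code above: it counts " High" with a leading space, because csv.reader keeps the space that follows each comma by default. A minimal sketch of a variant, assuming you would rather compare against the bare words, passes skipinitialspace=True to csv.reader. The in-memory sample and the names sample and table are made up for illustration and stand in for the real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory sample standing in for the input file.
sample = (
    "'IP Address', Severity\n"
    "10.0.0.1, High\n"
    "10.0.0.1, Low\n"
    "10.0.0.2, Medium\n"
)

# skipinitialspace=True drops the whitespace right after each delimiter,
# so the severity field is read as "High" instead of " High".
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
next(reader)  # skip header line

results = defaultdict(list)
for ip, severity in reader:
    results[ip].append(severity)

# One row per IP: [ip, high_count, medium_count, low_count]
table = [
    [ip] + [severities.count(level) for level in ("High", "Medium", "Low")]
    for ip, severities in sorted(results.items())
]
print(table)
```

With this option the list of severities to count no longer has to mirror the spacing of the input file.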
Answer 1 (score: 1)
Here is my solution, ag.py:
import collections
import csv
import sys

output = collections.defaultdict(collections.Counter)
with open(sys.argv[1], newline='') as infile:
    reader = csv.reader(infile)
    next(reader)  # skip header line
    for ip, level in reader:
        level = level.strip()  # remove surrounding spaces
        output[ip][level] += 1

print("'IP Address',High,Medium,Low")
for ip, count in output.items():
    print('{0},{1[High]},{1[Medium]},{1[Low]}'.format(ip, count))
To run the solution, issue the following command:
python ag.py data.csv
output is a dictionary whose keys are IPs and whose values are collections.Counter objects.
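To make that structure concrete, here is a small sketch of how the defaultdict of Counter objects behaves. The rows list is made-up sample data standing in for the parsed CSV, and the line variable is only for illustration. It also shows that the same structure can produce the first output format from the question (High:1, Medium:2, Low:0).

```python
import collections

# Hypothetical sample rows standing in for the parsed CSV data.
rows = [("10.0.0.3", "Medium"), ("10.0.0.3", "High"), ("10.0.0.3", "Medium")]

output = collections.defaultdict(collections.Counter)
for ip, level in rows:
    # The first access to a new IP creates an empty Counter for it.
    output[ip][level] += 1

counts = output["10.0.0.3"]
# A Counter returns 0 for a key it has never seen, so "Low" needs no special case.
line = "{0}, High:{1[High]}, Medium:{1[Medium]}, Low:{1[Low]}".format("10.0.0.3", counts)
print(line)  # 10.0.0.3, High:1, Medium:2, Low:0
```

Because a missing severity reads as 0, an IP with no Low entries still gets a complete row.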