This is my CSV file:
2017-07-14 03:05:23 B2KPRT320 - Error1
2017-07-14 03:05:23 B2KPRT320 - Error1
2017-07-15 03:05:23 B2KPRT320 - Error2
2017-07-15 03:05:23 B2KPRT320 - Error3
I need to count the errors per day.
This is my script so far:
import collections
Data = []
string = ""
array = []
with open('out.csv') as f:
    for line in f:
        Data.append([word for word in line.strip().split("\t")])
for item in Data:
    try:
        date,error = item[0],item[3]
        string = date + "\t" + error + "\n"
        array.append([word for word in string.strip().split("\t")])
    except IndexError:
        print("A line in the file doesn't have enough entries.")
Finally, I need to save the result in another CSV file, with this output:
2017-07-14 - Error1 2
2017-07-15 - Error2 1
2017-07-15 - Error3 1
Answer 0 (score: 0)
You can read the file into a list and use collections.Counter() to count the duplicate lines, then split() each line and take its first and last items. For example:
import collections
Data = []
string = ""
array = []
with open('test.txt') as f:
    # Count how many times each identical line occurs
    Data = collections.Counter(f.read().splitlines())
for item, c in Data.items():
    # Split on whitespace: the first field is the date, the last is the error
    item = item.split()
    date, error = item[0], item[-1]
    string = "{}\t{}\t{}".format(date, error, c)
    array.append(string)
for elem in array:
    print(elem)
This will output:
2017-07-15 Error3 1
2017-07-15 Error2 1
2017-07-14 Error1 2
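One caveat with counting whole lines: two entries for the same day and error are merged only when their time and host fields also match exactly. A minimal sketch that counts by (date, error) pair instead, assuming whitespace-separated fields with the date first and the error name last. The helper name count_errors_per_day and the altered time on the second sample line are illustrative, not from the original post:

```python
import collections


def count_errors_per_day(lines):
    """Count occurrences of each (date, error) pair.

    Assumes each usable line is whitespace-separated, with the date as
    the first field and the error name as the last field.
    """
    counts = collections.Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            # Skip blank or malformed lines instead of raising IndexError.
            continue
        counts[(fields[0], fields[-1])] += 1
    return counts


# Sample data from the question; the second line's time is changed
# (hypothetically) to show that it is still counted with the first.
sample = [
    "2017-07-14 03:05:23 B2KPRT320 - Error1",
    "2017-07-14 04:10:00 B2KPRT320 - Error1",
    "2017-07-15 03:05:23 B2KPRT320 - Error2",
    "2017-07-15 03:05:23 B2KPRT320 - Error3",
]

# Sorting the (date, error) keys gives a stable, date-ordered listing.
for (date, error), n in sorted(count_errors_per_day(sample).items()):
    print("{}\t{}\t{}".format(date, error, n))
```

Sorting the items also makes the output order predictable, which plain dictionary iteration does not guarantee on older Python versions.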
Edit
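You no longer need try/except, because item[-1] always gives you the last item of a non-empty list, however many fields the line has. Instead, you can check the length explicitly: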
if len(item) < x:
    # print error
else:
    # the above code
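The question also asks for the result to be saved to another file, which the printed output does not cover. A sketch that fills in the length check and adds a write step with the csv module, written for Python 3. The threshold MIN_FIELDS, the helper name build_rows, and the tab-separated output format are assumptions, and in-memory buffers stand in for the real files:

```python
import collections
import csv
import io

MIN_FIELDS = 2  # assumption: a usable line has at least a date and an error


def build_rows(lines):
    """Count identical lines, then reduce each to (date, error, count)."""
    rows = []
    for item, c in collections.Counter(lines).items():
        item = item.split()
        if len(item) < MIN_FIELDS:
            print("A line in the file doesn't have enough entries.")
        else:
            rows.append((item[0], item[-1], c))
    return sorted(rows)


# In-memory stand-ins for the input and output files (sample data from
# the question, plus one blank line to exercise the length check).
source = io.StringIO(
    "2017-07-14 03:05:23 B2KPRT320 - Error1\n"
    "2017-07-14 03:05:23 B2KPRT320 - Error1\n"
    "\n"
    "2017-07-15 03:05:23 B2KPRT320 - Error2\n"
    "2017-07-15 03:05:23 B2KPRT320 - Error3\n"
)
target = io.StringIO()

rows = build_rows(source.read().splitlines())
# Write one tab-separated row per (date, error, count) tuple.
csv.writer(target, delimiter="\t", lineterminator="\n").writerows(rows)
print(target.getvalue())
```

To write to disk, replace the buffers with real file objects, for example the input opened with open('test.txt') and the output opened with open('result.csv', 'w', newline=''); the output filename here is only a placeholder.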