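I have a program that processes log files containing text like the sample below. Please help me figure out how to print the list of components (the field after the date and time), ordered by the importance of their messages (the first word on each line).
For example, component A should come before component B in the list if it has more messages at the most important level.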
ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
EXPECTED OUTPUT:
unanticipated_konnor
furnacelike_marlene
unfulminating_deacon
mangiest_ima
I have written some code that counts how many messages each component logged, but I'm not sure whether it is useful:
from collections import Counter

# Count how many messages each component (the third field) logged.
warnList = []
with open('C:\\Users\\User\\Downloads\\tasks\\logs\\1.txt', 'r') as log_file:
    for line in log_file:
        warnList.append(line.split(' - ')[2])
res1 = dict(Counter(warnList))
print("Frequency of messages for components: {} \n".format(res1))
Any suggestion would be highly appreciated.
Hoping for your help or advice,
Thanks in advance,
Best regards
Answer 0 (score: -1)
I'm not entirely sure I understood your question correctly, but if you want to sort the log file by importance, try the following:
from __future__ import print_function
import re
import operator
import collections

# Lower value = more important.
importance = {
    'CRITICAL': 0,
    'ERROR': 100,
    'INFO': 200,
    'DEBUG': 300
}

with open('log.log', 'r') as f:
    data = f.read().splitlines()

# Map each full log line to the importance of its level.
parsed = collections.OrderedDict()
for line in data:
    cols = re.split(r'\s+\-\s+', line)
    parsed[line] = importance[cols[0]]

# sorted() is stable, so lines of equal importance keep their file order.
for k, v in sorted(parsed.items(), key=operator.itemgetter(1)):
    print(k)
Output:
CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
If that is not what you want, please clarify what you need.
If you only need the third column:
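from __future__ import print_function
import re
import operator
import collections

# Lower value = more important.
importance = {
    'CRITICAL': 0,
    'ERROR': 100,
    'INFO': 200,
    'DEBUG': 300
}

with open('log.log', 'r') as f:
    data = f.read().splitlines()

# Map each component (third column) to the importance of its level.
# Note: a later line for the same component overwrites the earlier value,
# so each component is ranked by the level of its last line in the file.
parsed = collections.OrderedDict()
for line in data:
    cols = re.split(r'\s+\-\s+', line)
    parsed[cols[2]] = importance[cols[0]]

for k, v in sorted(parsed.items(), key=operator.itemgetter(1)):
    print(k)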
Output:
unanticipated_konnor
furnacelike_marlene
unfulminating_deacon
mangiest_ima
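The snippet above stores a single level per component, and a later line replaces the earlier one, so it matches the expected output for this sample partly by luck. The question asks for something slightly different: a component should rank higher when it has more messages at the most important level. Below is a minimal sketch of one way to do that, assuming the level order is CRITICAL, ERROR, INFO, DEBUG and that ties at one level are broken by the next level down. The names `LEVELS`, `SAMPLE_LOG` and `rank_components` are made up for this illustration and are not part of the original code.

```python
from collections import Counter, defaultdict

# Assumed severity order, most important first (taken from the sample log).
LEVELS = ["CRITICAL", "ERROR", "INFO", "DEBUG"]

# The sample lines from the question, inlined so the sketch runs on its own.
SAMPLE_LOG = """\
ERROR - 2015 Dec 28 14:48:30 - unfulminating_deacon - 55 - airtightly unintelligently appropriable arlen
INFO - 2015 Dec 28 02:02:56 - mangiest_ima - 144 - overrealistically decadently unfierce edris
CRITICAL - 2015 Dec 27 20:04:02 - unanticipated_konnor - 44 - amusively sensationally turbanlike rico
INFO - 2015 Dec 28 08:12:06 - unfulminating_deacon - 123 - eruptively nonmodally sebacic shavonda
CRITICAL - 2015 Dec 28 08:04:27 - unanticipated_konnor - 1213 - unchastely priorly monophyletic cullen
ERROR - 2015 Dec 28 07:39:36 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
DEBUG - 2015 Dec 27 16:44:47 - mangiest_ima - 144 - questingly substitutionally uncompensative jen
ERROR - 2015 Dec 26 17:49:26 - furnacelike_marlene - 1414 - healthfully flinchingly unbombastic slyvia
"""


def rank_components(lines):
    """Return component names ordered by their per-level message counts."""
    counts = defaultdict(Counter)  # component -> Counter of level names
    for line in lines:
        fields = line.strip().split(" - ")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        level, component = fields[0], fields[2]
        counts[component][level] += 1

    def sort_key(component):
        # Negated counts make "more messages" sort earlier. Tuples compare
        # element by element, so CRITICAL counts are compared first.
        return tuple(-counts[component][level] for level in LEVELS)

    return sorted(counts, key=sort_key)


if __name__ == "__main__":
    for name in rank_components(SAMPLE_LOG.splitlines()):
        print(name)
```

To use it on a real file, pass the open file object instead of the sample, for example `rank_components(open('log.log'))`. Components with identical counts at every level keep the order in which they first appear, because `sorted()` is stable and dictionaries preserve insertion order on Python 3.7+.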