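I've been trying to write a script to process the various log files I deal with every day. I've tried writing it in bash, Perl, and Python, but so far without much success.
Here is a sample of the log: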
Table TRKGRP1: New table control.
TRKGRP1: 1000 tuples checked. Tuple checking still in progress...
Completed tuple checking.
SUMMARY: Tbl TRKGRP1: tuples checked 1297, passed 1297, failed 0.
Table TOLLTRKS: New table control.
Completed tuple checking.
SUMMARY: Tbl TOLLTRKS: tuples checked 3, passed 3, failed 1.
Table BRANDOPT: New table control.
Completed tuple checking.
SUMMARY: Tbl BRANDOPT: tuples checked 0, passed 0, failed 0.
Table C7UPTMR: New table control.
Completed tuple checking.
SUMMARY: Tbl C7UPTMR: tuples checked 4, passed 4, failed 3.
Table TOPSCOIN: New table control.
Completed tuple checking.
SUMMARY: Tbl TOPSCOIN: tuples checked 0, passed 0, failed 2.
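What I need is each block of text from "Table" through "failed 1/2/3". I only want to capture the blocks that end in failed 1, failed 2, failed 3, and so on; blocks that end in failed 0 are not needed. Keep in mind that these blocks are sometimes longer or shorter, not always 3 lines.
This is the expected output: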
Table TOLLTRKS: New table control.
Completed tuple checking.
SUMMARY: Tbl TOLLTRKS: tuples checked 3, passed 3, failed 1.
Table C7UPTMR: New table control.
Completed tuple checking.
SUMMARY: Tbl C7UPTMR: tuples checked 4, passed 4, failed 3.
Table TOPSCOIN: New table control.
Completed tuple checking.
SUMMARY: Tbl TOPSCOIN: tuples checked 0, passed 0, failed 2.
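I'd really appreciate any help.
Answer 0 (score: 1)
Once the file is split into groups of lines, extracting the data you want from each group becomes trivial. The following shows how to split the file into the groups you want.
With the whole file in a single variable: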
while ($file =~ /\G ( \S[^\n]*\n (?:(?:[^\n\S][^\n]*)?\n)* )/xg) {
    process($1);
}
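Reading one line at a time (a new group starts at every line that begins with a non-whitespace character):
my $buf = '';
while (<>) {
    if (/^\S/) {
        # A new group begins: hand the previous one to process().
        process($buf) if length($buf);
        $buf = '';
    }
    $buf .= $_;
}
# Flush the last group.
process($buf) if length($buf);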
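process is very simple:
sub process {
    for ($_[0]) {
        # Print the group only if it is a Table group
        # whose summary line reports at least one failure.
        print
            if /^Table /
            && /, failed (\d+)\.$/m
            && $1 > 0;
    }
}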
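For readers more at home in Python, the same group-then-filter idea can be sketched roughly as follows. This is only an illustration, not a translation of the Perl above: the helper names `split_groups` and `failed_groups` are made up, and it assumes that a new group starts at each line beginning with `Table ` (the Perl instead treats any line starting with a non-whitespace character as a group start).

```python
import re

# Mirrors the Perl test: a summary line ending in ", failed N."
FAILED_RE = re.compile(r", failed (\d+)\.$", re.MULTILINE)


def split_groups(lines):
    """Yield each group of lines as one string.

    Assumption: a group starts at a line beginning with 'Table '.
    """
    buf = []
    for line in lines:
        if line.startswith("Table ") and buf:
            yield "".join(buf)
            buf = []
        buf.append(line)
    if buf:
        yield "".join(buf)


def failed_groups(lines):
    """Yield only the Table groups whose failed count is greater than zero."""
    for group in split_groups(lines):
        match = FAILED_RE.search(group)
        if group.startswith("Table ") and match and int(match.group(1)) > 0:
            yield group


sample = """Table TOLLTRKS: New table control.
Completed tuple checking.
SUMMARY: Tbl TOLLTRKS: tuples checked 3, passed 3, failed 1.
Table BRANDOPT: New table control.
Completed tuple checking.
SUMMARY: Tbl BRANDOPT: tuples checked 0, passed 0, failed 0.
"""

# keepends=True preserves the newlines so groups print as they appear in the log.
kept = list(failed_groups(sample.splitlines(keepends=True)))
print("".join(kept), end="")
```

Keeping the grouping and the filtering in separate functions means the filter can be changed (for example, to a different threshold) without touching the code that reads the log.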
Answer 1 (score: 1)
Python. This is not the most efficient approach, but hopefully the algorithm is clear, and it works:
text = '''
Table TRKGRP1: New table control.
TRKGRP1: 1000 tuples checked. Tuple checking still in progress...
Completed tuple checking.
SUMMARY: Tbl TRKGRP1: tuples checked 1297, passed 1297, failed 0.
Table TOLLTRKS: New table control.
Completed tuple checking.
SUMMARY: Tbl TOLLTRKS: tuples checked 3, passed 3, failed 1.
Table BRANDOPT: New table control.
Completed tuple checking.
SUMMARY: Tbl BRANDOPT: tuples checked 0, passed 0, failed 0.
Table C7UPTMR: New table control.
Completed tuple checking.
SUMMARY: Tbl C7UPTMR: tuples checked 4, passed 4, failed 3.
Table TOPSCOIN: New table control.
Completed tuple checking.
SUMMARY: Tbl TOPSCOIN: tuples checked 0, passed 0, failed 2.
'''
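lines = text.splitlines(keepends=True)  # keep the newlines, as readlines() does, so writelines() emits one log line per line
Or, read the lines from a file:
with open('input.txt') as f:
    lines = f.readlines()  # the with block closes the file, so no explicit close() is needed
Then: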
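f = open("output.txt", 'w')
buf = []      # lines of the record currently being read
show = False  # whether the current record should be written out
for line in lines:
    if line.startswith('Table'):
        # A new record begins: write the previous one if it had failures.
        if show:
            f.writelines(buf)
        buf = []
        show = True
    buf.append(line)
    if line.find('failed 0') >= 0:
        show = False
# Write the last record if it had failures.
if show:
    f.writelines(buf)
f.close()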
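If you would rather compare the failed count as a number than search for the substring 'failed 0', a variant along these lines might work. It is only a sketch: the name `filter_failed` and the `SUMMARY_RE` pattern are my own assumptions based on the sample log, and a record is emitted only once its SUMMARY line has been seen, so a record with no SUMMARY line is dropped.

```python
import re

# Assumed shape of the summary line, taken from the sample log.
SUMMARY_RE = re.compile(r"^SUMMARY: .*failed (\d+)\.\s*$")


def filter_failed(lines):
    """Return the lines of every record whose SUMMARY reports failed > 0."""
    out = []
    buf = []
    for line in lines:
        if line.startswith("Table "):
            buf = []  # a new record begins; discard any unfinished one
        buf.append(line)
        match = SUMMARY_RE.match(line)
        if match:
            if int(match.group(1)) > 0:
                out.extend(buf)
            buf = []
    return out


sample = """Table TRKGRP1: New table control.
TRKGRP1: 1000 tuples checked. Tuple checking still in progress...
Completed tuple checking.
SUMMARY: Tbl TRKGRP1: tuples checked 1297, passed 1297, failed 0.
Table TOLLTRKS: New table control.
Completed tuple checking.
SUMMARY: Tbl TOLLTRKS: tuples checked 3, passed 3, failed 1.
"""

result = filter_failed(sample.splitlines(keepends=True))
print("".join(result), end="")
```

The returned list can be passed to `writelines()` on an output file in the same way as `buf` above.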