I have a log from my application that looks like this:
{Fri Mar 16 19:07:47 Program: job-a: <blah><blah>
Fri Mar 16 19:07:47 Program: job-a: <blah><blah>
Fri Mar 16 19:07:48 Program: job-b: <blah><blah>
Fri Mar 16 19:07:48 Program: job-b: <blah><blah>
Fri Mar 16 19:07:50 Program: job-b: <blah><blah>
Fri Mar 16 19:07:51 Program: job-b: <blah><blah>
Fri Mar 16 19:07:52 Program: job-a: <blah><blah>
Fri Mar 16 19:07:52 Program: job-a: <blah><blah>
Fri Mar 16 19:07:53 Program: job-a: <blah><blah>
Fri Mar 16 19:07:54 Program: job-a: <blah><blah>
Fri Mar 16 19:07:55 Program: job-a: <blah><blah>
Fri Mar 16 19:08:00 Program: job-a: <blah><blah>
Fri Mar 16 19:08:01 Program: job-a: <blah><blah>
Fri Mar 16 20:33:52 Program: job-c: <blah><blah>
Fri Mar 16 20:45:56 Program: job-c: <blah><blah>}
For each job name (job-a, job-b, and job-c in this case), I need to find the first and last line on which it occurs, to determine its start and end times.
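That is, I need to output the program/job name, start_time, and end_time, as in the sample output below. I have shown the expected output comma-separated, but I don't really care about the delimiter, since I am only interested in the values. Ignore the curly braces at the start and end of the sample input/output.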
job-a, Fri Mar 16 19:07:47, Fri Mar 16 19:08:01
job-b, Fri Mar 16 19:07:48, Fri Mar 16 19:07:51
job-c, Fri Mar 16 20:33:52, Fri Mar 16 20:45:56
Answer 0 (score: 0)
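You can use awk. Here I am only showing how to get the first and last time for each job, taking the job name from field 6 and the time from field 4.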
awk '!first[$6]{ first[$6]=$4 } { last[$6]=$4 }
END{ for (x in last) print x, first[x], last[x] }' OFS=', ' infile
job-a:, 19:07:47, 19:08:01
job-b:, 19:07:48, 19:07:51
job-c:, 20:33:52, 20:45:56
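The one-liner above prints only the time field and keeps the trailing colon on the job name, and the order of a `for (x in last)` loop is not specified by awk. A minimal sketch of one way to get closer to the output requested in the question follows. The function name `job_times` is hypothetical, and the sketch assumes every log line has the fixed layout from the sample (weekday, month, day, time, `Program:`, job name).

```shell
# job_times FILE: print "job, first timestamp, last timestamp" for each job.
# Hypothetical helper; assumes lines shaped like
#   Fri Mar 16 19:07:47 Program: job-a: <text>
job_times() {
  awk -v OFS=', ' '
    NF < 6 { next }                       # skip blank or malformed lines
    {
      sub(/^[{]/, "")                     # drop the stray leading brace from the sample
      stamp = $1 " " $2 " " $3 " " $4     # weekday, month, day, time
      job = $6
      sub(/:$/, "", job)                  # "job-a:" becomes "job-a"
      if (!(job in first)) {
        first[job] = stamp
        order[++n] = job                  # remember jobs in first-seen order
      }
      last[job] = stamp                   # overwritten until the final line
    }
    END {
      for (i = 1; i <= n; i++)
        print order[i], first[order[i]], last[order[i]]
    }
  ' "$1"
}
```

Calling `job_times infile` prints one line per job, in the order each job first appears in the log. Testing `job in first` instead of `!first[$6]` avoids relying on the stored value being a non-empty, non-zero string.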
Answer 1 (score: -1)
Here is an example in Perl:
use feature qw(say);
use strict;
use warnings;

my $fn = 'log.txt';
open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
my %jobs;
while (my $line = <$fh>) {
    chomp $line;
    # Skip lines without a job name; the optional \{ drops the sample's stray brace.
    my ($date, $job) = $line =~ /^\{?(.*?)\s*Program:\s*(job.*?):/ or next;
    # The first line seen for a job sets its start; every line updates its end,
    # so a job that logs only once still gets an end time.
    $jobs{$job}{start} //= $date;
    $jobs{$job}{end} = $date;
}
close $fh;
for my $job (sort keys %jobs) {
    say join ", ", $job, $jobs{$job}{start}, $jobs{$job}{end};
}