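I have a tool that sometimes produces near-duplicate output. The lines are not exactly identical, but they can be treated as one message. So I need to check whether there are five such logs in a row and, if so, print "this is a repeated log". This has to be decided by the order of the lines rather than their content, because there may be slight differences. I am trying to allow 5 full messages, and from the 6th message onward the lines should be masked as "rejected".
The logs look like the following. The real logs are long text; this shortened form is used for simplicity.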
-->echo "$x"
09:09:02 a aa1
09:09:03 a aa2
09:09:04 a aa3
09:09:05 a aa4
09:09:06 a aa5
09:09:07 a ssf
09:09:08 a s2
09:09:09 a 243
09:09:10 a 21
09:09:11 a 12
09:09:12 a 21
09:09:13 a 32
09:09:14 a 21
09:09:15 a 12
09:09:16 b 21
09:09:17 b 12
09:09:18 b 12
09:09:19 a 12
09:09:20 a 32
09:09:21 a 32
09:09:22 a 21
09:09:23 a 11
09:09:24 a 23
09:09:25 a 32
09:09:26 a 32
09:09:27 b 21
09:09:28 b 21
09:09:29 b 1
09:09:30 b 1
09:09:31 b 32
09:09:32 b 23
09:09:33 b 21
09:09:34 b 2
09:09:35 b 1
09:09:36 b 3
09:09:37 b 4
09:09:38 b 5
09:09:39 b 6
09:09:40 b 7
09:09:41 b 8
09:09:42 c 9
09:09:43 c 0
09:09:44 c 9
09:09:45 c 8
09:09:46 c 5
Expected result:
09:09:02 a aa1
09:09:03 a aa2
09:09:04 a aa3
09:09:05 a aa4
09:09:06 a aa5
09:09:07 above message is repeated
09:09:08 above message is repeated
09:09:09 above message is repeated
09:09:10 above message is repeated
09:09:11 above message is repeated
09:09:12 above message is repeated
09:09:13 above message is repeated
09:09:14 above message is repeated
09:09:15 above message is repeated
09:09:16 b 21
09:09:17 b 12
09:09:18 b 12
09:09:19 a 12
09:09:20 a 12
09:09:21 a 32
09:09:22 a 32
09:09:23 a 21
09:09:24 above message is repeated
09:09:25 above message is repeated
09:09:26 above message is repeated
09:09:27 b 21
09:09:28 b 21
09:09:29 b 1
09:09:30 b 1
09:09:31 b 32
09:09:32 above message is repeated
09:09:33 above message is repeated
09:09:34 above message is repeated
09:09:35 above message is repeated
09:09:36 above message is repeated
09:09:37 above message is repeated
09:09:38 above message is repeated
09:09:39 above message is repeated
09:09:40 above message is repeated
09:09:41 above message is repeated
09:09:42 c 9
09:09:43 c 0
09:09:44 c 9
09:09:45 c 8
09:09:46 c 5
I am trying to split them into groups of 5, but it does not print anything:
echo "$x" |awk '{input=$2;next}{if(input==$2)c=c+1;if(c<=5)print $0 ;print "above message is repeated"}'
Answer 0 (score: 2)
As per the OP's comments, the second column of Input_file is already sorted. Could you please try the following.
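awk '
# The key in column 2 changed: start a new run.
prev!=$2{
  count=0
}
{
  ++count
}
# Past the fifth line of a run: print only the time stamp and the notice.
count>5{
  print $1,"above message is repeated"
  next
}
# Within the first five lines of a run: print the line unchanged.
1
{
  prev=$2
}' Input_file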
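For reference, the attempt in the question prints nothing because its first block, `{input=$2;next}`, has no pattern, so it runs for every line and `next` skips the second block entirely. Below is a minimal sketch of the same counting idea with a configurable limit, assuming the key is the second field. The names `mask_repeats` and `limit` are made up for this illustration and are not part of the answer above.

```shell
# Hypothetical helper (not from the original answer): reads log lines on
# stdin and masks every line after the first N of a consecutive run.
mask_repeats() {
  awk -v limit="${1:-5}" '
    $2 != prev { count = 0; prev = $2 }   # key changed: restart the counter
    ++count > limit {                     # past the limit: mask this line
      print $1, "above message is repeated"
      next
    }
    { print }                             # within the limit: print as is
  '
}

# Small demonstration with a limit of 2 instead of 5.
printf '%s\n' '09:09:01 a x1' '09:09:02 a x2' '09:09:03 a x3' '09:09:04 b y1' |
  mask_repeats 2
```

With a limit of 2, the third `a` line is replaced by its time stamp plus the notice, and the `b` line starts a fresh run.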
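EDIT: Adding a solution based on Tiw's comment and nice idea. In case someone needs to print the time range over which the message was repeated, try the following.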
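awk '
# The key changed: report the suppressed part of the previous run, then reset.
prev!=$2 && prev{
  if(count>5){
     print "Time stamp FROM " start " to " prev_time " Above message repeated " value_count " times."
  }
  count=value_count=start=prev_time=""
}
{
  ++count
}
{
  prev=$2
  prev_time=$1
}
# Past the fifth line of a run: remember the first suppressed time stamp and count the line.
count>5{
  start=start?start:$1
  value_count++
  next
}
1
# Report a suppressed run that is still open when the input ends.
END{
  if(count>5){
     print "Time stamp FROM " start " to " prev_time " Above message repeated " value_count " times."
  }
}
' Input_file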
The output of the above code is as follows.
09:09:02 a aa1
09:09:03 a aa2
09:09:04 a aa3
09:09:05 a aa4
09:09:06 a aa5
Time stamp FROM 09:09:07 to 09:09:15 Above message repeated 9 times.
09:09:16 b 21
09:09:17 b 12
09:09:18 b 12
09:09:19 a 12
09:09:20 a 32
09:09:21 a 32
09:09:22 a 21
09:09:23 a 11
Time stamp FROM 09:09:24 to 09:09:26 Above message repeated 3 times.
09:09:27 b 21
09:09:28 b 21
09:09:29 b 1
09:09:30 b 1
09:09:31 b 32
Time stamp FROM 09:09:32 to 09:09:41 Above message repeated 10 times.
09:09:42 c 9
09:09:43 c 0
09:09:44 c 9
09:09:45 c 8
09:09:46 c 5
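The counts in these summaries can be checked against the lengths of the consecutive runs in the sample: the runs are 14, 3, 8, 15 and 5 lines long, so 9, 3 and 10 lines fall past the fifth line of their run. A minimal sketch for listing run lengths, assuming the key is the second field; `run_lengths` is a hypothetical helper name, not part of the answer above.

```shell
# Hypothetical helper: prints "<key> <run length>" for each run of
# consecutive lines that share the same second field.
run_lengths() {
  awk '
    $2 != prev { if (NR > 1) print prev, n; prev = $2; n = 0 }  # close the previous run
    { n++ }                                                     # count the current line
    END { if (NR > 0) print prev, n }                           # close the last run
  '
}

# Small demonstration: two "a" lines, one "b" line, then one "a" line.
printf '%s\n' '09:09:01 a 1' '09:09:02 a 2' '09:09:03 b 3' '09:09:04 a 4' |
  run_lengths
```

Any run longer than 5 in this listing corresponds to one summary line, with a count equal to the run length minus 5.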