Computing totals from data in a log file

Time: 2017-01-18 11:10:06

Tags: awk grep pattern-matching

I have a log file in the following format:

 1: 2017-01-17 00:00:00,723 - [INFO] gid: 123456787  type: A
 2: 2017-01-17 00:00:00,727 - [INFO] gid: 123456787  Trans: 178
 3: 2017-01-17 00:00:00,729 - [INFO] gid: 123456788  type: B
 4: 2017-01-17 00:00:00,731 - [INFO] gid: 123456788  Trans: 121
 5: 2017-01-17 00:00:00,751 - [INFO] gid: 123456789  type: C
 6: 2017-01-17 00:00:00,771 - [INFO] gid: 123456790  type: D
 7: 2017-01-17 00:00:00,787 - [INFO] gid: 123456790  Trans: 121
 8: 2017-01-17 00:00:00,778 - [INFO] gid: 123456791  type: C
 9: 2017-01-17 00:00:00,789 - [INFO] gid: 123456791  Trans: 150

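My goal is to compute the total of the Trans values, grouped by type. My idea was to join every two lines into one and then grep for the type keyword.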

$ cat logfile.txt |awk 'ORS=NR%2?FS:RS'|grep A 
2017-01-17 00:00:00,723 - [INFO] gid: 123456787  type: A 2017-01-17 00:00:00,727 - [INFO] gid: 123456787  Trans: 178

$ cat logfile.txt |awk 'ORS=NR%2?FS:RS'|grep C
2017-01-17 00:00:00,751 - [INFO] gid: 123456789  type: C 2017-01-17 00:00:00,771 - [INFO] gid: 123456790  type: D

Expected output:

$ cat logfile.txt |awk 'ORS=NR%2?FS:RS'|grep B|awk '{sum+=$16} END {print sum}'
121

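Unfortunately, the log contains type lines that are not followed by a Trans line (line 5), so joining every two lines pairs the wrong lines together from that point on.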
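To make the misalignment concrete, here is a minimal sketch that runs the same joining command on a five-line excerpt. The excerpt is an assumption made for brevity: it keeps only the last two fields of each log line, and the variable name pairs is hypothetical.

```shell
# Five-line excerpt of the log, reduced to the last two fields.
# The third line ("type: C") has no Trans line after it.
pairs=$(printf '%s\n' 'type: A' 'Trans: 178' 'type: C' 'type: D' 'Trans: 121' |
    awk 'ORS=NR%2?FS:RS')

# Print the joined lines; the second one pairs two type lines,
# and the Trans value for type D is left on a line by itself.
printf '%s\n' "$pairs"
```

Once a single Trans line is missing, every later pair is shifted by one line, which is why grepping the joined output for a type no longer finds its Trans value.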

Any ideas that would help me reach this goal are welcome.

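1 answer:

Answer 0: (score: 3)

There is no need for all that ORS magic: just store the last type found and use an array to keep a running total for each type.

Since the useful data is always the last word on the line, extract it with $NF: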

awk '$NF ~ /^[0-9]+$/ {          # if last field is a digit
         data[type]+=$NF; next   # make the addition to this value
     }
     {type=$NF}                  # otherwise, pick the type value

     # finally, loop through the array and print the data
     END {for (i in data)        
          print i, data[i]}' file

With your given file:

$ awk '$NF ~ /^[0-9]+$/ {data[type]+=$NF; next} {type=$NF} END {for (i in data) print i, data[i]}' f
A 178
B 121
C 150
D 121
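
The command above relies on each Trans line coming directly after the type line it belongs to. If lines for different gids could ever interleave (an assumption; the sample log does not show this), a variant keyed on the gid may be more robust. A minimal sketch, not part of the original answer, in which the temporary file and the names type_of, total and result are hypothetical:

```shell
# Recreate the sample log from the question, without the display line numbers.
log=$(mktemp)
cat > "$log" <<'EOF'
2017-01-17 00:00:00,723 - [INFO] gid: 123456787  type: A
2017-01-17 00:00:00,727 - [INFO] gid: 123456787  Trans: 178
2017-01-17 00:00:00,729 - [INFO] gid: 123456788  type: B
2017-01-17 00:00:00,731 - [INFO] gid: 123456788  Trans: 121
2017-01-17 00:00:00,751 - [INFO] gid: 123456789  type: C
2017-01-17 00:00:00,771 - [INFO] gid: 123456790  type: D
2017-01-17 00:00:00,787 - [INFO] gid: 123456790  Trans: 121
2017-01-17 00:00:00,778 - [INFO] gid: 123456791  type: C
2017-01-17 00:00:00,789 - [INFO] gid: 123456791  Trans: 150
EOF

# Field 6 is the gid and field 7 is the key ("type:" or "Trans:").
# Remember the type seen for each gid, then add every Trans value
# to the total of the type recorded for that gid.
result=$(awk '
    $7 == "type:"                     { type_of[$6] = $NF }
    $7 == "Trans:" && ($6 in type_of) { total[type_of[$6]] += $NF }
    END { for (t in total) print t, total[t] }
' "$log" | sort)

printf '%s\n' "$result"
rm -f "$log"
```

The output is piped through sort because awk does not guarantee the iteration order of for (t in total). On the sample log this gives the same four totals as the answer's command.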