Parsing and transforming the following log with awk

Time: 2019-07-08 04:51:45

Tags: linux awk

I have a log like this:

DEBUG: Worker thread (#12) initialized
DEBUG: Worker thread (#19) initialized
DEBUG: Worker thread (#9) initialized
DEBUG: Worker thread (#15) initialized
DEBUG: Worker thread (#3) initialized
DEBUG: Worker thread (#17) initialized
DEBUG: Worker thread (#14) initialized
DEBUG: Worker thread (#16) initialized
Threads started!

[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00) lat (ms,99%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 20 tps: 35965.67 qps: 35965.67 (r/w/o: 0.00/35965.67/0.00) lat (ms,99%): 2.03 err/s: 0.00 reconn/s: 0.00
...

I want to parse this log file and extract all lines of the following form:

[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00) lat (ms,99%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 20 tps: 35965.67 qps: 35965.67 (r/w/o: 0.00/35965.67/0.00) lat (ms,99%): 2.03 err/s: 0.00 reconn/s: 0.00
....

I then want to convert those lines into the following format for plotting:

5,35265.85
10,35965.67
...

Here is my awk code:

#!/usr/bin/env bash
awk '
BEGIN {
printf "#time,tps\n";
}
/^\[\ [0-9]{1,4}[s]?\ \]/ { # regex for [ 1050s ]
printf "%s,%s\n", substr($2,1, length($2)-1), $7
}
' "$@"

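What I dislike about this solution is that I have to count by hand the index of each token awk produces. I would prefer something more robust, such as "the first token after the string tps". That would be more general and easier to maintain.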

My question: can I actually do this with awk, or is there a better solution for my case?

3 answers:

Answer 0: (score: 2)

Here is one way to do it. Assuming your log file is named data.txt, you can run the following:

cat data.txt | grep -wE "5s|10s" | awk '{print substr($(NF-16), 1, length($(NF-16))-1) "," $(NF-13) "," $(NF-11) "," $(NF-9)}' 

Explanation

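  1. cat <filename> prints the contents of the file to standard output.
  2. grep -wE <exp> filters the output of cat and keeps the lines that match the expression, here 5s or 10s. The -w option restricts matches to whole words; without it, 5s would also match 15s, 25s, and so on.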

This selects the following lines for awk to process:

[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00) lat (ms,99%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 20 tps: 35965.67 qps: 35965.67 (r/w/o: 0.00/35965.67/0.00) lat (ms,99%): 2.03 err/s: 0.00 reconn/s: 0.00
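The effect of -w can be checked with a small sketch. The sample lines here are shortened copies of log lines from this page (the shortening is an assumption made for readability), and grep -c is used only to count the matching lines:

```shell
# Shortened log lines, used only for this illustration.
sample='[ 5s ] thds: 20 tps: 35265.85
[ 10s ] thds: 20 tps: 35965.67
[ 15s ] thds: 20 tps: 35233.82'

# Without -w, the pattern 5s also matches inside 15s.
without_w=$(printf '%s\n' "$sample" | grep -cE '5s|10s')

# With -w, only whole-word matches count, so the 15s line is skipped.
with_w=$(printf '%s\n' "$sample" | grep -cwE '5s|10s')

echo "without -w: $without_w lines, with -w: $with_w lines"
```
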
  3. With awk, NF holds the number of fields on each line; awk '{print NF}' shows that it is 18 here.

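Extract the fields at positions NF-16, NF-13, NF-11, and NF-9, that is, the second, fifth, seventh, and ninth fields. The second field is 5s, 10s, and so on, and you want to drop the trailing s. substr($2, 1, length($2)-1) does that: it takes the characters from position 1 up to the length of the field (2 for 5s, 3 for 10s) minus 1, which removes the last character.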
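As a quick check of the field arithmetic, here is a minimal sketch that runs on a single line copied from the log. The variable names nf and secs are only illustrative:

```shell
line='[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00) lat (ms,99%): 2.52 err/s: 0.00 reconn/s: 0.00'

# Count the whitespace-separated fields on the line.
nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# When NF is 18, $(NF-16) and $2 are the same field; substr drops the trailing "s".
secs=$(printf '%s\n' "$line" | awk '{ print substr($(NF-16), 1, length($(NF-16)) - 1) }')

echo "NF=$nf seconds=$secs"
```

Counting from the end with NF only helps if the number of trailing fields is more stable than the number of leading ones; for this log either form addresses the same columns.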

Your final awk command is

awk '{print substr($(NF-16), 1, length($(NF-16))-1) "," $(NF-13) "," $(NF-11) "," $(NF-9)}'

which can be replaced with

awk '{print substr($2, 1, length($2)-1)","$5","$7","$9}'

Putting it all together:

cat data.txt | grep -wE "5s|10s" | awk '{print substr($2, 1, length($2)-1)","$5","$7","$9}'
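The grep pattern above selects only the 5s and 10s lines. A possible variation, assuming every report line starts with "[ <number>s ]", is to let awk do the filtering itself. The function name extract_all is hypothetical and the input lines are shortened copies of the log:

```shell
# extract_all is a hypothetical helper name used only in this sketch.
extract_all() {
  # Match any line that starts with "[ <digits>s ]", then print time,thds,tps,qps.
  awk '/^\[ [0-9]+s \]/ { print substr($2, 1, length($2) - 1) "," $5 "," $7 "," $9 }' "$@"
}

# Reads standard input when no file name is given.
printf '%s\n' \
  'Threads started!' \
  '[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00)' \
  '[ 15s ] thds: 20 tps: 35233.82 qps: 35233.82 (r/w/o: 0.00/35233.82/0.00)' |
  extract_all
```

With a real file this would be called as extract_all data.txt, which also removes the need for cat and grep.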

Answer 1: (score: 1)

Using tr and awk:

tr -cd '0-9 .\n' <file | awk 'NF>1 && NF=4' OFS=","
  

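tr deletes every character in the file except the digits 0-9, spaces, dots, and newlines, and pipes what remains to awk. If a line then has more than one field (NF>1), awk truncates it to four fields (NF=4), which also rebuilds the line with OFS as the separator.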
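The two stages can be inspected separately with a minimal sketch on one line copied from the log. The assignment is parenthesized here only to make the precedence explicit; the variable names are illustrative:

```shell
line='[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00) lat (ms,99%): 2.52 err/s: 0.00 reconn/s: 0.00'

# Stage 1: keep only digits, spaces, dots and newlines.
stripped=$(printf '%s\n' "$line" | tr -cd '0-9 .\n')

# Stage 2: awk splits on runs of whitespace; assigning NF=4 truncates the
# record to four fields and rebuilds it using OFS.
csv=$(printf '%s\n' "$stripped" | awk 'NF>1 && (NF=4)' OFS=",")

echo "$csv"
```

Note that stripping the slashes fuses the r/w/o values into one token (0.0035265.850.00), so this approach is only reliable for the leading columns.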

Output:

5,20,35265.85,35265.85
10,20,35965.67,35965.67
15,20,35233.82,35233.82
20,20,35239.05,35239.25
25,20,37188.61,37188.41
30,20,36622.32,36622.32
35,20,36538.27,36538.27

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

Answer 2: (score: 1)

Is this what you are trying to do?

$ awk -v OFS=',' '/^\[/{print $2+0, $5, $7, $9}' file
5,20,35265.85,35265.85
10,20,35965.67,35965.67
15,20,35233.82,35233.82
20,20,35239.05,35239.25
25,20,37188.61,37188.41
30,20,36622.32,36622.32
35,20,36538.27,36538.27

Or, if you want a header row, like this:

$ awk -F'[ :]+' -v OFS=',' '/^\[/{ if (!doneHdr++) print "time", $4, $6, $8; print $2+0, $5, $7, $9}' file
time,thds,tps,qps
5,20,35265.85,35265.85
10,20,35965.67,35965.67
15,20,35233.82,35233.82
20,20,35239.05,35239.25
25,20,37188.61,37188.41
30,20,36622.32,36622.32
35,20,36538.27,36538.27

Or this:

$ awk -F'[ :]+' -v OFS=',' -v tgts='time thds tps qps' '
    BEGIN {
        numTags = split(tgts,tags)
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            printf "%s%s", tags[tagNr], (tagNr<numTags ? OFS : ORS)
        }
    }
    /^\[/ {
        for (i=1; i<=NF; i++) {
            f[$i] = $(i+1)
            sub(/[^0-9]+$/,"",f[$i])
        }
        f["time"] = f["["]

        for (tagNr=1; tagNr<=numTags; tagNr++) {
            printf "%s%s", f[tags[tagNr]], (tagNr<numTags ? OFS : ORS)
        }
    }
' file
time,thds,tps,qps
5,20,35265.85,35265.85
10,20,35965.67,35965.67
15,20,35233.82,35233.82
20,20,35239.05,35239.25
25,20,37188.61,37188.41
30,20,36622.32,36622.32
35,20,36538.27,36538.27

I ran the above against your original sample input:

$ cat file
DEBUG: Worker thread (#12) initialized
DEBUG: Worker thread (#19) initialized
DEBUG: Worker thread (#9) initialized
DEBUG: Worker thread (#15) initialized
DEBUG: Worker thread (#3) initialized
DEBUG: Worker thread (#17) initialized
DEBUG: Worker thread (#14) initialized
DEBUG: Worker thread (#16) initialized
Threads started!

[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85 (r/w/o: 0.00/35265.85/0.00) lat (ms,99%): 2.52 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 20 tps: 35965.67 qps: 35965.67 (r/w/o: 0.00/35965.67/0.00) lat (ms,99%): 2.03 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 20 tps: 35233.82 qps: 35233.82 (r/w/o: 0.00/35233.82/0.00) lat (ms,99%): 2.26 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 20 tps: 35239.05 qps: 35239.25 (r/w/o: 0.00/35239.25/0.00) lat (ms,99%): 2.11 err/s: 0.00 reconn/s: 0.00
[ 25s ] thds: 20 tps: 37188.61 qps: 37188.41 (r/w/o: 0.00/37188.41/0.00) lat (ms,99%): 1.86 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 20 tps: 36622.32 qps: 36622.32 (r/w/o: 0.00/36622.32/0.00) lat (ms,99%): 1.96 err/s: 0.00 reconn/s: 0.00
[ 35s ] thds: 20 tps: 36538.27 qps: 36538.27 (r/w/o: 0.00/36538.27/0.00) lat (ms,99%): 2.00 err/s: 0.00 reconn/s: 0.00
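
If only one value is needed, the "first token after the string tps" idea from the question can be sketched more compactly. This is a minimal illustration rather than part of the answer above: the helper name after_label is hypothetical, and it assumes the label appears as its own whitespace-separated field (for example "tps:").

```shell
# after_label is a hypothetical helper name used only in this sketch.
after_label() {
  # $1 is the label to look for (for example "tps:"); remaining args are files.
  label=$1
  shift
  awk -v OFS=',' -v label="$label" '
    /^\[/ {
      for (i = 1; i < NF; i++) {
        if ($i == label) {        # found the label, so the value is the next field
          print $2 + 0, $(i + 1)  # $2+0 turns "5s" into 5
          break
        }
      }
    }
  ' "$@"
}

# Reads standard input when no file name is given.
printf '%s\n' \
  'Threads started!' \
  '[ 5s ] thds: 20 tps: 35265.85 qps: 35265.85' \
  '[ 10s ] thds: 20 tps: 35965.67 qps: 35965.67' |
  after_label 'tps:'
```

Because the lookup is by label rather than by position, the same call works if fields are added before tps:, for instance after_label 'qps:' file.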