Sed or awk: find text between two strings plus an additional identifier

Time: 2015-05-05 17:29:45

Tags: awk sed

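I want to search a file and extract the data between two strings. I can do that with sed, but I also need it to extract the information only for a specific field. For example: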

2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV

---SYSLOG DATA
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF

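My sed statement can extract the data between the HOOK_EV and HOOK_EV_OFF strings, but I want it to extract data only for a specific SID number. Currently it pulls everything between the two strings, regardless of SID. So in the example above, I only want the data for SID:1630710955 between the HOOK_EV and HOOK_EV_OFF strings.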

Is all of this possible?

2 answers:

Answer 0: (score: 3)

sed -n '/HOOK_EV$/,/HOOK_EV_OFF$/ {/SID:1630710955/p}' infile
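
Note that the HOOK_EV and HOOK_EV_OFF lines themselves carry the same SID in the sample, so the command above prints them as well. If only the lines in between are wanted, one possible variation is sketched below. The function name extract_sid_block and the way it takes the SID as an argument and reads the log on standard input are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical helper: print lines for SID $1 that lie strictly between
# a line ending in HOOK_EV and the next line ending in HOOK_EV_OFF.
extract_sid_block() {
    sed -n "/HOOK_EV\$/,/HOOK_EV_OFF\$/ {
        /HOOK_EV\$/d
        /HOOK_EV_OFF\$/d
        /SID:$1]/p
    }"
}

# Sample lines in the same shape as the log in the question.
extract_sid_block 1630710955 <<'EOF'
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
EOF
# Prints only the second line: the one [S=4445] [SID:1630710955] entry inside the block.
```

The two d commands drop the marker lines before the SID test runs, and the closing bracket in the pattern keeps a longer ID that merely starts with the same digits from matching.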

Answer 1: (score: 2)

Here is an awk one-liner:

awk -v sid=1630710955 '/HOOK_EV_OFF$/{flag=0;next}{if(flag && $0 ~ "SID:"sid){print}}/HOOK_EV$/{flag=1;next}' infile

Explanation:

awk -v sid=1630710955 '/HOOK_EV_OFF$/{flag=0;next} # End marker found: clear the flag and skip to the next line
                       {if(flag && $0 ~ "SID:"sid){print}} # Inside a block and the line contains the SID: print it
                       /HOOK_EV$/{flag=1;next} # Start marker found: set the flag and skip to the next line
                       ' infile
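
One caveat with the test $0 ~ "SID:"sid is that it is a substring match, so sid=1630710955 would also match a longer ID such as 16307109550. A minimal sketch of a stricter variant follows, assuming the SID always appears in the bracketed form [SID:n] shown in the sample. It compares a literal string with index() instead of a regex. The wrapper name print_sid_lines is hypothetical.

```shell
# Hypothetical wrapper: $1 = SID to look for, log is read from standard input.
print_sid_lines() {
    awk -v sid="$1" '
        /HOOK_EV_OFF$/ { flag = 0; next }      # end marker: leave the block
        flag && index($0, "[SID:" sid "]")     # literal match, brackets included
        /HOOK_EV$/     { flag = 1; next }      # start marker: enter the block
    '
}

# Made-up lines: the second one has a longer SID that shares the same prefix.
printf '%s\n' \
    'a [SID:1630710955] HOOK_EV' \
    'b [SID:16307109550]' \
    'c [SID:1630710955]' \
    'd [SID:1630710955] HOOK_EV_OFF' | print_sid_lines 1630710955
# Prints: c [SID:1630710955]
```

The rule order is the same as in the one-liner: the end marker is handled first and the start marker last, so neither marker line is ever printed.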

To pick up the SID dynamically from each HOOK_EV line, you can use:

awk '/HOOK_EV_OFF$/{flag=0;SID="";next} 
     flag && $NF==SID
     /HOOK_EV$/{flag=1;SID=$(NF-1);next}' infile

Given this input file:

2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test2 [S=4444] [SID:1630710965] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710965] 
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710967] 
2015-04-29T08:05:24.668345-04:00 test2 [S=4444] [SID:1630710965] HOOK_EV_OFF

The output will be:

2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710965]
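
To check this locally, the run can be reproduced with a short script along these lines. It is only a sketch: the temporary file created by mktemp stands in for infile, and the awk program is the one from the answer, unchanged.

```shell
# Recreate the sample input in a temporary file (stand-in for "infile").
infile=$(mktemp)
cat > "$infile" <<'EOF'
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710956]
2015-04-29T08:05:24.668345-04:00 test1 [S=4444] [SID:1630710955] HOOK_EV_OFF
2015-04-29T08:05:24.668345-04:00 test1 [S=4445] [SID:1630710955]
2015-04-29T08:05:24.668345-04:00 test2 [S=4444] [SID:1630710965] HOOK_EV
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710965]
2015-04-29T08:05:24.668345-04:00 test2 [S=4447] [SID:1630710967]
2015-04-29T08:05:24.668345-04:00 test2 [S=4444] [SID:1630710965] HOOK_EV_OFF
EOF

# Remember the [SID:...] token seen on each HOOK_EV line, then print only
# the lines inside that block whose last field is the same token.
result=$(awk '/HOOK_EV_OFF$/{flag=0;SID="";next}
     flag && $NF==SID
     /HOOK_EV$/{flag=1;SID=$(NF-1);next}' "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

Because the SID is read from the second-to-last field of the HOOK_EV line and compared with the last field of the lines that follow, this relies on the field layout of the sample log.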