Bash command to extract lines based on a starting key

Time: 2017-06-14 12:26:37

Tags: bash sed grep

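I have a log file like the one below. Each line records some string and a thread ID. Each thread belongs to a process, and a process can contain N threads.

Based on the following example, I want to extract (using bash tools such as grep, sed, etc.) all lines of all threads that belong to a given process. Note that the process is mentioned only once, at the top of a thread's sequence of lines: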

line1 thread= 150 process= 200
line2 thread= 152 whatever
line3 thread= 150 whatever
line4 thread= 150 whatever
line5 thread= 130 whatever
line6 thread= 130 process= 200
line7 thread= 150 process= 201
line8 thread= 130 whatever
line9 thread= 130 whatever

For this example, given process 200, the output should be:

line1 thread= 150 process= 200
line3 thread= 150 whatever
line4 thread= 150 whatever
line6 thread= 130 process= 200
line8 thread= 130 whatever
line9 thread= 130 whatever

1 answer:

Answer 0 (score: 0)

An awk solution:

filter_threads.awk script:

#!/bin/awk -f
function get_thread(s,    t) {        # extracts the thread number from the string; t is a local variable
    t = substr(s, index(s, "=") + 1)  # treats `=` as the separator (e.g. `thread=150`)
    return t
}
BEGIN {
    pat = "process=" p   # exact field text that identifies the requested process
}
$3 == pat {    # on encountering the "process" line of the requested process
    thread = get_thread($2); print; next   # remember the base thread number
}
{
    t = get_thread($2)
    if (t == thread) print   # print lines whose thread matches the base thread number
}

Usage:

awk -f filter_threads.awk -v p=200 yourfile

- where p is the process number
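
The script expects each key and value to form a single field (`thread=150`), whereas the sample in the question has a space after `=` (`thread= 150`). As a minimal sketch, assuming that spacing is the only difference, the input could be normalized with sed before it reaches the script. The name `normalize_log` is a hypothetical helper, not part of the original answer:

```shell
# Collapse "key= value" into "key=value" so that each pair becomes one awk field.
# normalize_log is a hypothetical helper name; it reads stdin and writes stdout.
normalize_log() {
    sed 's/=[[:space:]]*/=/g'
}
```

It could then be chained in front of the script, for example `normalize_log < yourfile | awk -f filter_threads.awk -v p=200`.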

Output:

line1 thread=150 process=200
line3 thread=150 whatever
line4 thread=150 whatever
line6 thread=130 process=200
line8 thread=130 whatever
line9 thread=130 whatever

<强> 更新

当您更改初始输入时,新解决方案如下:

awk -v p=200 '$4~/process=/ && $5==p{ thread=$3; print; next }$3==thread{ print }' yourfile
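
The one-liner above remembers only one thread at a time, so it covers the sample but would miss lines if two threads of the same process were interleaved. A possible variant, sketched here under the assumption that the fields are laid out exactly as in the question (`$3` is the thread ID, `$4` is `process=`, `$5` is the process ID), keeps a thread-to-process map instead. The names `filter_process` and `owner` are illustrative:

```shell
# filter_process is a hypothetical wrapper: $1 = process ID, $2 = log file.
filter_process() {
    awk -v p="$1" '
        # A line that names a process assigns (or reassigns) its thread to that process.
        $4 == "process=" { owner[$3] = $5 }
        # Print every line whose thread is currently mapped to the requested process.
        ($3 in owner) && owner[$3] == p
    ' "$2"
}
```

Called as `filter_process 200 yourfile`, this should print the six lines the question asks for, keeping the original spacing.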