Sample log:
2018-01-01 11:30:22 xxx Parsing xxx
2018-01-01 11:30:23 driver queryId=<xxx> Parsing command: select *
from table
limit 10
2018-01-01 11:30:25 Parsing completed
2018-01-01 11:30:28 xxxxxx
2018-01-01 11:30:40 driver queryId=<xxx> Parsing command: select * from table group by column
2018-01-01 11:30:45 Parsing completed
2018-01-01 11:30:51 xxxxxx
2018-01-01 11:30:52 xxx Parsing xxx
2018-01-01 11:30:54 driver queryId=<xxx> Parsing command: select
*
from table
order by column
limit 20
2018-01-01 11:30:56 Parsing completed
2018-01-01 11:30:59 xxxxxx
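I want to remove the newlines between the matching patterns "Parsing command:" and "2018". The output should contain only the words found between those patterns.
Example to parse: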
2018-01-01 11:30:54 driver queryId=<xxx> Parsing command: select
*
from table
order by column
limit 20
2018-01-01 11:30:56 Parsing completed
The output for the above example should be:
select * from table order by column limit 20
Answer 0 (score: 1)
Here is a very short solution using perl rather than sed/awk:
perl -ne 's/\n/ /; print +(s/^.*Parsing command: // .. /^2018/ or next) =~ /E/ ? "\n" : $_' input.log
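The one-liner above hinges on Perl's range operator (..), which behaves like a flip-flop: it switches on when its left condition matches and off again once its right condition matches. As a rough illustration only, the same control flow might be emulated in Python as follows. The function name flip_flop_extract is hypothetical, and this is a sketch of the idea, not a translation of Perl's exact semantics.

```python
def flip_flop_extract(lines):
    """Emulate the perl one-liner: join the lines of each command into one output line."""
    marker = "Parsing command: "
    out = []
    in_range = False  # state of the emulated range operator
    for raw in lines:
        line = raw.replace("\n", " ", 1)  # like s/\n/ /
        if not in_range:
            # Left condition: strip everything up to the last marker, if present.
            _, found, tail = line.rpartition(marker)
            if not found:
                continue  # like "or next": ignore lines outside the range
            line = tail
            in_range = True
        # Right condition, tested on the same line as the left one (as ".." does).
        if line.startswith("2018"):
            in_range = False
            out.append("\n")  # last line of the range: emit only a newline
        else:
            out.append(line)  # the trailing newline is already a space
    return "".join(out)
```

As with the perl version, each output line keeps a trailing space before its newline.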
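The idea:
We loop over the input lines (-n). For each line, we run the code (-e ...):
- We replace the newline with a space (s/\n/ /).
- We use a range operator, COND1 .. COND2, a condition that is true for all lines in the range between COND1 and COND2.
- COND1 is s/^.*Parsing command: //, which is true if it manages to remove some prefix of the input line ending in "Parsing command: ". This is the start of our range.
- COND2 is /^2018/, which is true if the input line starts with 2018. This is the end of our range.
- If we are not inside the range, we skip to the next line (... or next). For the rest of the code, we only consider lines in the range.
- The value returned by .. is a sequence number. The last line in the range has E0 appended to it. We check /E/ to single out the last line of the range (the one starting with 2018), because we don't want to print it.
- For that last line we print a newline instead ("\n"); otherwise we print the line (whose trailing newline was turned into a space by the first substitution).
Answer 1 (score: 1)
With sed, though it looks a bit scary :-/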
sed -nE '/Parsing command:/{
s/^.*Parsing command://;:l1;N;/Parsing completed[[:blank:]]*$/!bl1;
s/2018-.*Parsing completed[[:blank:]]*$//;
s/\n/ /g;s/^[[:blank:]]*//;s/[[:blank:]]+/ /gp}' logfile
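Note that the last two substitutions are only there for fine-grained formatting, and the p flag on the last s command takes care of the printing.
Output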
select * from table limit 10
select * from table group by column
select * from table order by column limit 20
All good :-)
Recommended reading: sed branching statements.
Answer 2 (score: 1)
Awk solution:
awk '/Parsing command:/{ f=1; sub(/.*Parsing command: /,""); q=$0; next }
f && /^2018/{ gsub(/[[:space:]]{2,}/, " ", q); print q; f=0 }
NF && f{ q=q" "$0 }' logfile
Output:
select * from table limit 10
select * from table group by column
select * from table order by column limit 20
Answer 3 (score: 1)
sed script, in the file extractcommand.sed:
#!/usr/bin/sed -f
/Parsing command:/!{d;b} # delete+continue if 'Parsing command' not found
:a # if found, then start a loop with label (a)
s/.*Parsing command:\s*// # delete that 'Parsing command'
/Parsing completed/{ # if found 'Parsing completed'
s:\n[^\n]*Parsing completed:: # then delete that 'Parsing completed'
s:\n: :g # change all \n to space
s:  *: :g # collapse runs of spaces into one (optional)
b # break the loop (and print as default)
} #
N # load another line into buffer
ba # loop to label (a)
Test:
$ ./extractcommand.sed <sample.log
select * from table limit 10
select * from table group by column
select * from table order by column limit 20
Answer 4 (score: 0)
Keep it simple. Given your first posted input file, using GNU awk for multi-char RS and RT:
$ awk -F'Parsing command: ' -v RS='[^\n]+Parsing completed' 'RT{gsub(/\s+/," ",$NF); print $NF}' file
select * from table limit 10
select * from table group by column
select * from table order by column limit 20
Or with any awk:
$ cat tst.awk
/Parsing completed/ {
    gsub(/ +/," ",buf)
    sub(/.*Parsing command: /,"",buf)
    print buf
    buf = ""
}
{ buf = buf " " $0 }
$ awk -f tst.awk file
select * from table limit 10
select * from table group by column
select * from table order by column limit 20
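The record-oriented idea, treating everything up to a "Parsing completed" line as one record, can also be sketched with a single regular expression over the whole file contents. This is only an illustration in Python, assuming the log fits in memory. BLOCK and extract_commands_regex are hypothetical names.

```python
import re

# Lazily capture everything after the marker, up to the line containing "Parsing completed".
BLOCK = re.compile(r"Parsing command:(.*?)\n[^\n]*Parsing completed", re.DOTALL)

def extract_commands_regex(text):
    """Return the commands in text, each collapsed onto a single line."""
    # str.split() with no arguments splits on any whitespace run, newlines included.
    return [" ".join(body.split()) for body in BLOCK.findall(text)]
```

A call such as extract_commands_regex(open("file").read()) would then yield one string per command.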