Question

My file contains a recurring pattern like this:

....
BEGIN
any text1
any text2
END
....
BEGIN
any text3
garbage text
any text4
END
....
BEGIN
any text5
any text6
END
...

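BEGIN and END are my markers, and I only want to extract the text between them when the block does not contain 'garbage text'. So I expect to extract the following blocks: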

any text1
any text2

any text5
any text6

How can I do this in awk? I know I can do this:

awk '/BEGIN/{f=1;next}/END/{f=0;}f' file.log

to extract the lines between the two markers, but how can I refine the result further by filtering out the blocks that contain 'garbage text'?

Answer 1

$ awk '/END/{if (rec !~ /garbage text/) print rec} {rec=rec $0 ORS} /BEGIN/{rec=""}' file
any text1
any text2

any text5
any text6

The above assumes every END is paired with a preceding BEGIN. With GNU awk, which supports a multi-character RS, you could alternatively use:

$ awk -v RS='END\n' '{sub(/.*BEGIN\n/,"")} RT!="" && !/garbage text/' file
any text1
any text2

any text5
any text6
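
If neither the pairing assumption nor GNU awk's multi-character RS can be relied on, an explicit flag is one portable way to sketch the same filter. This is only an illustration: the variable names (inblock, rec, bad) and the file name sample.log are made up here, and it assumes each marker sits alone on its own line, so the markers are compared exactly instead of with a regexp.

```shell
# Sample input in the shape of the question, plus one stray END near the
# top to exercise the unpaired case (the file name is hypothetical).
cat > sample.log <<'EOF'
....
END
....
BEGIN
any text1
any text2
END
....
BEGIN
any text3
garbage text
any text4
END
....
BEGIN
any text5
any text6
END
...
EOF

result=$(awk '
  # Start of a block: reset the buffer and the "contains garbage" flag.
  $0 == "BEGIN" { inblock = 1; rec = ""; bad = 0; next }

  # End of a block: print the buffer only if a BEGIN was seen
  # and no line in the block matched the unwanted text.
  $0 == "END" {
      if (inblock && !bad) print rec
      inblock = 0
      next
  }

  # Inside a block: remember the line and note any unwanted text.
  inblock {
      if (/garbage text/) bad = 1
      rec = rec $0 ORS
  }
' sample.log)

printf '%s\n' "$result"
```

On this sample it prints the "any text1/any text2" and "any text5/any text6" blocks separated by a blank line, and the stray END is ignored because no block is open at that point.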

By the way, instead of:

awk '/BEGIN/{f=1;next}/END/{f=0;}f' file.log

your original code should be just:

awk '/END/{f=0} f; /BEGIN/{f=1}' file.log
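
The reordering works because the END test clears the flag before the print test runs, and the BEGIN test sets it only after, so neither marker line is printed and no next is needed. A quick way to check that the two forms agree on well-formed input is sketched below; the file name markers_demo.log and the shell variable names are invented for the demo.

```shell
# Small sample in the shape of the question's file (file name is made up).
cat > markers_demo.log <<'EOF'
....
BEGIN
any text1
any text2
END
....
BEGIN
any text5
END
...
EOF

# Original form: set the flag and skip the BEGIN line, clear the flag on END.
with_next=$(awk '/BEGIN/{f=1;next} /END/{f=0} f' markers_demo.log)

# Reordered form: clear on END, print while inside a block, then set on BEGIN.
reordered=$(awk '/END/{f=0} f; /BEGIN/{f=1}' markers_demo.log)

printf '%s\n' "$reordered"
[ "$with_next" = "$reordered" ] && echo "same output"
```

The two forms can differ on malformed input: if a second BEGIN appears before the matching END, the reordered form prints that BEGIN line while the original skips it.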