I need to extract blocks of text (delimited by two words, Start and End) that contain another word (an id).
For example:
2014-07-01 13:26:07,760 Start
2014-07-01 13:26:07,762 id: 456456454
2014-07-01 13:26:07,763 other
2014-07-01 13:26:07,764 End
2014-07-01 13:26:07,764 aaaaaaaa
2014-07-01 13:26:07,764 bbbbbbbb
2014-07-01 13:26:07,765 Start
2014-07-01 13:26:07,765 id: 930939023
2014-07-01 13:26:07,765 something
2014-07-01 13:26:07,766 End
2014-07-01 13:26:07,766 Start
2014-07-01 13:26:07,766 id: 876542
2014-07-01 13:26:07,766 other
2014-07-01 13:26:07,767 End
2014-07-01 13:26:07,767 aaaaaaaa
2014-07-01 13:26:07,767 bbbbbbbb
2014-07-01 13:26:07,767 Start
2014-07-01 13:26:07,767 id: 930939023
2014-07-01 13:26:07,768 something
2014-07-01 13:26:07,768 End
2014-07-01 13:26:07,768 Start
2014-07-01 13:26:07,768 id: 54654
2014-07-01 13:26:07,768 something
2014-07-01 13:26:07,769 End
For id = 930939023, the output should be:
2014-07-01 13:26:07,765 Start
2014-07-01 13:26:07,765 id: 930939023
2014-07-01 13:26:07,765 something
2014-07-01 13:26:07,766 End
2014-07-01 13:26:07,767 Start
2014-07-01 13:26:07,767 id: 930939023
2014-07-01 13:26:07,768 something
2014-07-01 13:26:07,768 End
Answer 0 (score: 7)
Here is an option using sed:
sed -n '/Start/{:a;/End/!{N;ba};/930939023/!d;p}' file
sed -n '                  # Suppress default printing
/Start/ {                 # When a line contains Start
    :a;                   # Create a label a for the loop
    /End/! {              # Until a line with End is seen
        N;                # Append the next line to pattern space
        ba                # Go back to label a and repeat
    }
    /930939023/!d;        # If the collected block does not contain the id, delete it
    p                     # Else print it
}' file
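To reuse the sed approach for other ids, the command can be wrapped in a small shell function. This is only a sketch under a few assumptions: the name extract_blocks_sed is made up for illustration, the id consists of digits only (it is pasted into the regex unescaped), and the id is the last thing on its line with no trailing whitespace. Matching `id: <id>` followed by a newline keeps a short id from also selecting longer ids that merely start with it.

```shell
# Hypothetical helper: print every Start..End block whose id line is exactly "id: <id>".
# Usage: extract_blocks_sed ID FILE
extract_blocks_sed() {
    id=$1      # assumed to contain digits only; it is not escaped for the regex
    file=$2
    # The script is single-quoted; the quotes are closed only around "$id".
    sed -n '
    /Start/{
        :a
        /End/!{
            N
            ba
        }
        /id: '"$id"'\n/!d
        p
    }' "$file"
}
```

Called as `extract_blocks_sed 930939023 file`, it should select the same blocks as the one-liner above.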
Answer 1 (score: 1)
You can use awk. Because the script gets more complex, I suggest storing it in a file:
extract.awk:
# Set flag if id was found
/id: 930939023/{f=1}
# On "Start", begin a new buffer with this line and reset the flag
/Start/{b=$0;f=0;next}
# On "End", if the flag was set, print the buffer followed by the End line
/End/{
    if(f){
        print b
        print
        f=0   # do not print again on a stray "End" outside a block
    }
}
# Append all other lines to the buffer
# (the buffer is replaced on the next "Start")
{b=b"\n"$0}
...and run it like this:
awk -f extract.awk file
Output:
2014-07-01 13:26:07,765 Start
2014-07-01 13:26:07,765 id: 930939023
2014-07-01 13:26:07,765 something
2014-07-01 13:26:07,766 End
2014-07-01 13:26:07,767 Start
2014-07-01 13:26:07,767 id: 930939023
2014-07-01 13:26:07,768 something
2014-07-01 13:26:07,768 End
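The id is hard-coded in extract.awk. As a rough sketch, the same buffering idea can take the id as a parameter through awk's -v option. The function name extract_blocks_awk and the variable names are made up for illustration, and the field positions assume the log layout shown in the question (date, time, then `id:` and its value as the third and fourth fields).

```shell
# Hypothetical helper: print every Start..End block whose id field equals ID exactly.
# Usage: extract_blocks_awk ID FILE
extract_blocks_awk() {
    awk -v id="$1" '
        # A Start line opens a new block and resets the match flag.
        /Start/ { buf = $0; inblock = 1; hit = 0; next }
        inblock {
            buf = buf "\n" $0
            # Compare as strings so that 0123 and 123 are treated as different ids.
            if ($3 == "id:" && ($4 "") == (id "")) hit = 1
            # An End line closes the block; print it only if the id was seen.
            if (/End/) { if (hit) print buf; inblock = 0 }
        }
    ' "$2"
}
```

Comparing the whole field instead of using a regex means that an id such as 93093902 does not also match 930939023.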
Answer 2 (score: 1)
You could try the awk command below:
$ awk '/Start/ {f=1} /End/ {print;f=0;}f' file | awk -v RS="End" -v ORS="End" '/930939023/'
2014-07-01 13:26:07,765 Start
2014-07-01 13:26:07,765 id: 930939023
2014-07-01 13:26:07,765 something
2014-07-01 13:26:07,766 End
2014-07-01 13:26:07,767 Start
2014-07-01 13:26:07,767 id: 930939023
2014-07-01 13:26:07,768 something
2014-07-01 13:26:07,768 End
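The pipeline works in two stages. The first awk keeps only the lines from Start through End and drops everything between blocks. The second awk splits that stream into records at the string End and prints the records containing the id. The sketch below isolates the first stage so its effect can be checked separately; the function name blocks_only is made up for illustration.

```shell
# Hypothetical helper showing stage 1 alone: keep Start..End lines, drop the rest.
# Usage: blocks_only FILE
blocks_only() {
    # f is 1 while inside a block. The End line is printed explicitly
    # before f is cleared, so the trailing "f" pattern does not print it twice.
    awk '/Start/ {f=1} /End/ {print; f=0} f' "$1"
}
```

Two things to keep in mind about the second stage: a multi-character RS such as "End" is handled as expected by gawk and mawk, but POSIX leaves it unspecified, and because each record starts right after the previous End, the printed output can begin with an empty line and end without a final newline.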