I have a file with contents like the following, where START and STOP delimit a block:
START
X | 123
Y | abc
Z | +=-
STOP
START
X | 456
Z | +%$
STOP
START
X | 789
Y | ghi
Z | !@#
STOP
For each block, I want to print the values of X and Y in the following format:
123 ~~ abc
456 ~~
789 ~~ ghi
If there were only a single START / STOP pair, sed -n '/START/,/STOP/p' would have helped. Since the blocks repeat, I need your help.
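Answer 0 (score: 2)
sed is always the wrong choice for any problem that involves processing multiple lines. All of sed's arcane constructs for doing so became obsolete in the mid-1970s when awk was invented.
Whenever the input contains name-value pairs, I find it useful to create an array that maps each name to its value and then access the array by name. In this case, using GNU awk for its multi-character RS and for deleting a whole array: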
$ cat tst.awk
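BEGIN {
    RS = "\nSTOP\n"          # one record per block (multi-char RS needs GNU awk)
    OFS = " ~~ "
}
{
    delete n2v               # forget the previous block's values
    for (i=2; i<=NF; i+=3) { # $1 is START; then fields come as name, "|", value
        n2v[$i] = $(i+2)
    }
    print n2v["X"], n2v["Y"]
}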
$ gawk -f tst.awk file
123 ~~ abc
456 ~~
789 ~~ ghi
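For an awk without multi-character RS, the same name-to-value array idea can be sketched line by line. This is not the answer's script: `print_xy` is a hypothetical helper name, and the sketch assumes every data line has the form `name | value` with a value that contains no whitespace.

```shell
# Sample input from the question, stored in a variable for the demo.
sample='START
X | 123
Y | abc
Z | +=-
STOP
START
X | 456
Z | +%$
STOP
START
X | 789
Y | ghi
Z | !@#
STOP'

# print_xy is a hypothetical helper, not part of the original answer.
print_xy() {
    awk '
        BEGIN { OFS = " ~~ " }
        $1 == "START" { split("", n2v); next }             # portable way to empty the array
        $1 == "STOP"  { print n2v["X"], n2v["Y"]; next }   # block finished: report X and Y
        { n2v[$1] = $3 }                                   # field 1 is the name, field 3 the value
    '
}

printf '%s\n' "$sample" | print_xy
```

As with the gawk version, a block without Y still prints the separator, so its line ends in a trailing space after `~~`.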
Answer 1 (score: 2)
Based on my own solution to "How to select lines between two marker patterns which may occur multiple times with awk/sed":
awk -v OFS=" ~~ " '
/START/{flag=1;next}
/STOP/{flag=0; print first, second; first=second=""}
flag && $1=="X" {first=$3}
flag && $1=="Y" {second=$3}' file
$ awk -v OFS=" ~~ " '/START/{flag=1;next}/STOP/{flag=0; print first, second; first=second=""} flag && $1=="X" {first=$3} flag && $1=="Y" {second=$3}' a
123 ~~ abc
456 ~~
789 ~~ ghi
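One caveat with the flag approach: /START/ and /STOP/ match anywhere on a line, so a value such as RESTART would be taken for a marker. A minimal sketch that compares whole lines instead, assuming the markers always stand alone on their line (`extract_xy` is a hypothetical name, and the input is made up for the demo):

```shell
# extract_xy is a hypothetical helper; it reads the named files or stdin.
extract_xy() {
    awk -v OFS=" ~~ " '
        $0 == "START" { flag = 1; first = second = ""; next }     # open a block, reset values
        $0 == "STOP"  { if (flag) print first, second; flag = 0; next }
        flag && $1 == "X" { first  = $3 }
        flag && $1 == "Y" { second = $3 }
    ' "$@"
}

# The Y value contains the text START but is not a marker line.
printf '%s\n' START 'X | 1' 'Y | RESTART' STOP START 'X | 2' STOP | extract_xy
```

With the regex version, the `Y | RESTART` line would hit the /START/ rule and be skipped, so the Y value of the first block would be lost.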
Answer 2 (score: 1)
Because I like brain teasers (and not because this sort of thing is practical to do in sed), a possible sed solution is
sed -n '/START/,/STOP/ { //!H; // { g; /^$/! { s/.*\nX | \([^\n]*\).*/\1 ~~/; ta; s/.*/~~/; :a G; s/\n.*Y | \([^\n]*\).*/ \1/; s/\n.*//; p; s/.*//; h } } }'
It works as follows:
/START/,/STOP/ {                        # between two start and stop lines
  //! H                                 # assemble the lines in the hold buffer
                                        # note that // repeats the previously
                                        # matched pattern, so // matches the
                                        # start and end lines, //! all others.
  // {                                  # At the end
    g                                   # That is: When it is one of the
    /^$/! {                             # boundary lines and the hold buffer
                                        # is not empty
      s/.*\nX | \([^\n]*\).*/\1 ~~/     # isolate the X value, append ~~
      ta                                # if there is no X value, just use ~~
      s/.*/~~/
      :a
      G                                 # append the hold buffer to that
      s/\n.*Y | \([^\n]*\).*/ \1/       # and isolate the Y value so that
                                        # the pattern space contains X ~~ Y
      s/\n.*//                          # Cutting off everything after a newline
                                        # is important if there is no Y value
                                        # and the previous substitution did
                                        # nothing
      p                                 # print the result
      s/.*//                            # and make sure the hold buffer is
      h                                 # empty for the next block.
    }
  }
}
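To see the // shorthand in isolation, here is a minimal sketch that prints only the lines strictly between the markers. `inner_lines` is a hypothetical helper name, and the behavior described is what I would expect from GNU sed; other sed implementations may handle the empty regex differently.

```shell
# inner_lines is a hypothetical helper; it reads the named files or stdin.
inner_lines() {
    # Inside the range, // reuses the regex that was matched last:
    # /START/ on the opening line, /STOP/ on every later line of the range.
    # So //!p prints every line of the range except the two marker lines.
    sed -n '/START/,/STOP/{//!p;}' "$@"
}

printf '%s\n' before START 'X | 123' 'Y | abc' STOP after | inner_lines
```

The full script uses the same trick twice: //!H collects the inner lines in the hold buffer, and the // { ... } group runs on the marker lines.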