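I need to process records that are spread across multiple lines. For example, I need to convert each multi-line record into a single line and then extract whatever I need from it. The records are not cleanly delimited, so I cannot simply set RS to \n\n.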
cat input
constant_string bla bla1
bla bla bal
fooo foooooo baaar #End of record 1
constant_string bla1 bla2
abcd cdfe fghi jkhil
foo bar bar bar bar bar bar #End of record 2
constant_string bla bla3
random data is present #End of record 3
To achieve this, I first turn the undelimited records into delimited ones by inserting a blank line between consecutive records:
awk '{gsub(/^constant_string/,"\n&")}1' input

constant_string bla bla1
bla bla bal
fooo foooooo baaar

constant_string bla1 bla2
abcd cdfe fghi jkhil
foo bar bar bar bar bar bar

constant_string bla bla3
random data is present
Once the records are delimited, I can set RS to \n\n (paragraph mode, written as RS= below) and do whatever I need.
awk '{gsub(/^constant_string/,"\n&")}1' input |awk -v RS= '{$1=$1}1'
constant_string bla bla1 bla bla bal fooo foooooo baaar
constant_string bla1 bla2 abcd cdfe fghi jkhil foo bar bar bar bar bar bar
constant_string bla bla3 random data is present
Question:
I can get the result in two steps. Is it possible to do it in a single step in awk?
I tried the following, but it did not work:
awk -v RS="" '{gsub(/^constant_string/,"\n&")}1' input
awk -v RS="" '{$0=gensub(/^constant_string/,"\n&",$0)}1' input
Answer 0: (score: 2)
How about buffering lines in b until the next constant_string, and processing the last buffer in END? Using a function:
$ awk '
function process(str) { if(str!="") print str }
/^constant_string/ { process(b); b=$0; next }
{ b=b OFS $0 }
END { process(b) }
' file
constant_string bla bla1 bla bla bal fooo foooooo baaar
constant_string bla1 bla2 abcd cdfe fghi jkhil foo bar bar bar bar bar bar
constant_string bla bla3 random data is present
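As for why the two one-step attempts in the question do not work, a likely explanation is this: with RS set to the empty string awk reads in paragraph mode, and because the input has no blank lines the whole file arrives as a single record. The ^ anchor then matches only at the very start of that record, so only the first constant_string is touched. (Separately, the third argument of gensub is the "how" flag, such as "g", not the target string.) A minimal sketch to check this, assuming a small made-up sample and a temporary file:

```shell
# Build a small sample with no blank lines between records (made-up data).
input=$(mktemp)
printf '%s\n' \
  'constant_string bla bla1' \
  'bla bla bal' \
  'constant_string bla1 bla2' \
  'abcd cdfe' > "$input"

# In paragraph mode the whole file is one record, so NR is 1 at END.
record_count=$(awk -v RS="" 'END { print NR }' "$input")
echo "records seen: $record_count"

# ^ anchors to the start of the record, so gsub reports a single match.
match_count=$(awk -v RS="" '{ print gsub(/^constant_string/, "X&") }' "$input")
echo "substitutions made: $match_count"

rm -f "$input"
```

Both numbers come out as 1 on this sample, which is why the separator never gets inserted before the second and later records.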
Answer 1: (score: 1)
awk 'BEGIN{ RS="(^|\n)constant_string"}
# filtering to avoid "empty" record
/./ {
# $1 is the first "word" (default FS) AFTER the constant string, which
# is stripped from $0 because it acts as the record separator.
# Note that this is now a multi-line record.
# ... process the record however you need
print " -- " NR " : [" $0 "]"
for (i=1;i<=NF;i++) print NR "." i " : " $i
}
' YourFile
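The loop above depends on a regular-expression RS, which POSIX leaves unspecified for multi-character values, so it may not behave the same in every awk. A rough portable sketch of a similar per-field report, buffering lines instead of changing RS, could look like this. The report helper, the buf variable, and the sample data are assumptions for illustration, and unlike the version above the constant string stays in the record as field 1:

```shell
# Made-up sample input with two records.
input=$(mktemp)
printf '%s\n' \
  'constant_string bla bla1' \
  'bla bla bal' \
  'constant_string bla1 bla2' \
  'abcd cdfe' > "$input"

report=$(awk '
  # report() is a hypothetical helper: split the joined record on blanks
  # and print each field as "record.field : value".
  function report(rec,    n, i, f) {
    if (rec == "") return
    n = split(rec, f, " ")
    recno++
    for (i = 1; i <= n; i++) print recno "." i " : " f[i]
  }
  /^constant_string/ { report(buf); buf = $0; next }  # a new record starts
  { buf = buf " " $0 }                                 # continuation line
  END { report(buf) }                                  # flush the last record
' "$input")
printf '%s\n' "$report"

rm -f "$input"
```

Any per-record work can replace the loop inside report, since the whole record is available there as one string.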
Note:
Answer 2: (score: 0)
If you have GNU awk, try this:
awk 'NR>1{gsub(/\n/," "); print RS$0}' RS='constant_string' f
constant_string bla bla1 bla bla bal fooo foooooo baaar
constant_string bla1 bla2 abcd cdfe fghi jkhil foo bar bar bar bar bar bar
constant_string bla bla3 random data is present
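If GNU awk is not available, a streaming variant that leaves RS alone might be enough when the goal is only to join the lines. The sketch below is an assumption-laden illustration, not a drop-in replacement: the seen flag and the sample data are made up, and it prints each line as it arrives instead of keeping the record in memory.

```shell
# Made-up sample input with two records.
input=$(mktemp)
printf '%s\n' \
  'constant_string bla bla1' \
  'bla bla bal' \
  'constant_string bla1 bla2' \
  'abcd cdfe' > "$input"

joined=$(awk '
  /^constant_string/ {            # a new record starts here
    if (seen) printf "\n"         # close the previous output line
    seen = 1
    printf "%s", $0
    next
  }
  { printf "%s%s", (seen ? " " : ""), $0; seen = 1 }   # continuation line
  END { if (seen) printf "\n" }   # terminate the last output line
' "$input")
printf '%s\n' "$joined"

rm -f "$input"
```

Because nothing is buffered, this version cannot inspect a complete record before printing it. When the joined record has to be examined as a whole, buffering it first is the better fit.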