How do I insert a string between consecutive lines that start with the same word?

Time: 2017-05-16 12:55:24

Tags: bash awk sed

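I have a text file whose lines alternate between ones starting with 'WordNode' and ones starting with 'gloss word', but sometimes several lines starting with 'gloss word' appear in a row: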

WordNode"a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
gloss word "turned on"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
gloss word "ether"

I would like to insert the preceding WordNode ... line before each repeated line starting with 'gloss word':

WordNode "a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
WordNode "akti"
gloss word "turned on"
WordNode "akti"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
WordNode "aitxero"
gloss word "ether"

I tried this:

sed -r ':a; N; /(gloss word)[^\n]*\n\1/ s/\n.*//; ta; P; D' file1.txt > file2.txt

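but it keeps only the first 'gloss word' line and deletes the repeated lines that follow it. What is the right way to do this with sed, awk, or any other regex tool?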

4 answers:

Answer 0: (score: 1)

awk to the rescue!

$ awk '/^WordNode/{header=$0; p=0} p{print header} /^gloss word/{p=1} 1' file
WordNode"a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
WordNode "akti"
gloss word "turned on"
WordNode "akti"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
WordNode "aitxero"
gloss word "ether"
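The one-liner above is dense, so here is the same logic spelled out with comments, as a sketch rather than a replacement. The sample() function and its five-line input are made up for illustration, and the flag is renamed from p to seen for readability; everything else follows the answer's script.

```shell
# Hypothetical sample input; not the asker's real file.
sample() {
  printf '%s\n' \
    'WordNode "akti"' \
    'gloss word "running"' \
    'gloss word "turned on"' \
    'WordNode "aitco"' \
    'gloss word "Armenian"'
}

result=$(sample | awk '
  # A WordNode line: remember it and reset the "gloss already seen" flag.
  /^WordNode/   { header = $0; seen = 0 }
  # A gloss line has already followed this header, so repeat the header first.
  seen          { print header }
  # Mark that the current header now has at least one gloss line.
  /^gloss word/ { seen = 1 }
  # Print every input line unchanged.
  { print }
')

printf '%s\n' "$result"
```

Because the flag is only set after the check, the first gloss line under a WordNode line is printed without an extra header, and every later one gets the header repeated in front of it.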


Answer 1: (score: 1)

This might work for you (GNU sed):

sed '/WordNode/h;//d;x;p;x' file

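Lines containing WordNode are copied to the hold space (HS) and then deleted, so they are not printed directly. For every other line, i.e. the lines containing gloss word, swap with the HS so that the stored WordNode line is in the pattern space (PS), print it, then swap back so that the gloss word line is in the PS again, where it is printed automatically at the end of the cycle.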
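The hold-space steps described above can be sketched as a commented, multi-line sed script. This is an equivalent spelling, not the answer's exact command: the `//d` shorthand is written out as a `{ h; d }` group, and the sample() function with its input is invented for the demonstration.

```shell
# Hypothetical sample input; not the asker's real file.
sample() {
  printf '%s\n' \
    'WordNode "akti"' \
    'gloss word "running"' \
    'gloss word "turned on"' \
    'WordNode "aitco"' \
    'gloss word "Armenian"'
}

result=$(sample | sed '
  # WordNode line: save it in the hold space, then drop it from the output.
  /WordNode/{
    h
    d
  }
  # Any other line: bring the saved WordNode line into the pattern space,
  x
  # print it,
  p
  # and swap the gloss line back so the automatic print emits it.
  x
')

printf '%s\n' "$result"
```

Since every WordNode line is deleted and then re-emitted in front of each gloss line, a WordNode line that has no gloss line after it would vanish from the output; that does not occur in the data shown in the question.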

Answer 2: (score: 0)

This is easy to do with a shell script instead of sed or awk:

while IFS= read -r line; do
    # Remember the most recent WordNode line; print it before every gloss line.
    if [[ $line == WordNode* ]]; then wnl=$line; else printf '%s\n' "$wnl" "$line"; fi
done < file1.txt

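(This only echoes the last WordNode line seen before a gloss word line, so if you expect several WordNode lines in a row and want all of them echoed, you'd have to adapt it to keep track of all of them.)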
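One possible way to make the loop remember a whole run of WordNode lines is sketched below. It is an assumption about what "all of them echoed" should mean (repeat the entire run before every gloss line), and the names expand, nodes, and in_run are made up here. It sticks to POSIX sh constructs, so it uses case instead of [[ ]].

```shell
# Sketch: repeat every WordNode line of the current run before each gloss line.
expand() {
  nodes=''    # WordNode lines of the current run, each followed by a newline
  in_run=0    # 1 while the previous line was a WordNode line
  while IFS= read -r line; do
    case $line in
      WordNode*)
        # A WordNode line that follows a gloss line starts a fresh run.
        [ "$in_run" -eq 1 ] || nodes=''
        nodes="${nodes}${line}
"
        in_run=1
        ;;
      *)
        in_run=0
        # nodes already ends with a newline, so the gloss line starts its own line.
        printf '%s%s\n' "$nodes" "$line"
        ;;
    esac
  done
}

# Hypothetical input with two WordNode lines in a row.
result=$(printf '%s\n' \
  'WordNode "a"' \
  'WordNode "b"' \
  'gloss word "x"' \
  'gloss word "y"' \
  'WordNode "c"' \
  'gloss word "z"' | expand)

printf '%s\n' "$result"
```

To run it on a file, redirect the file into the function, for example `expand < file1.txt`.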

Answer 3: (score: 0)

$ awk '/WordNode/{h=$0 ORS;next} {print h $0}' file
WordNode"a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
WordNode "akti"
gloss word "turned on"
WordNode "akti"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
WordNode "aitxero"
gloss word "ether"