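I have a text file containing alternating lines that start with 'WordNode' and 'gloss word', but sometimes there are several consecutive lines starting with 'gloss word':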
WordNode"a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
gloss word "turned on"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
gloss word "ether"
I would like to insert the preceding WordNode ... line before each repeated line that starts with 'gloss word':
WordNode "a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
WordNode "akti"
gloss word "turned on"
WordNode "akti"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
WordNode "aitxero"
gloss word "ether"
I tried this:
sed -r ':a; N; /(gloss word)[^\n]*\n\1/ s/\n.*//; ta; P; D' file1.txt > file2.txt
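But it only keeps the first 'gloss word' line and deletes the repeated lines that follow. What is the correct way to do this with sed, awk, or any other regex tool?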
Answer 0: (score: 1)
$ awk '/^WordNode/{header=$0; p=0} p{print header} /^gloss word/{p=1} 1' file
WordNode"a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
WordNode "akti"
gloss word "turned on"
WordNode "akti"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
WordNode "aitxero"
gloss word "ether"
Answer 1: (score: 1)
This might work for you (GNU sed):
sed '/WordNode/h;//d;x;p;x' file
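Lines containing WordNode are stored in the hold space (HS) and then deleted. For all other lines, i.e. those containing gloss word, swap to the HS, print it, then swap back to the pattern space (PS), which is printed automatically at the end of the cycle.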
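To make the hold-space steps easier to follow, here is the same script written one command per line with comments. This is only a sketch: the sample input and the temporary file are assumptions made for illustration, and the commands are the ones from the one-liner above.

```shell
# Sample input, written to a temporary file (illustrative data only).
input=$(mktemp)
cat > "$input" <<'EOF'
WordNode "akti"
gloss word "running"
gloss word "turned on"
WordNode "aitco"
gloss word "Armenian"
EOF

# Same commands as the one-liner, one per line, each with a comment.
result=$(sed '
# WordNode line: copy it into the hold space
/WordNode/h
# empty regex reuses /WordNode/: drop the line and start the next cycle
//d
# any other line: swap, so the saved WordNode line is in the pattern space
x
# print the saved WordNode line
p
# swap back; the gloss line is printed automatically at the end of the cycle
x
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because WordNode lines are only printed when a gloss line follows, a WordNode line with no gloss line after it would not appear in the output at all.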
Answer 2: (score: 0)
This is easily done with a shell script rather than sed or awk:
while IFS= read -r line; do
  # Remember the latest WordNode line; repeat it before every other line.
  if [[ $line == WordNode* ]]; then wnl=$line; else echo "$wnl"; echo "$line"; fi
done < file1.txt
(This only echoes the last WordNode line before each gloss word line, so if you expect several WordNode lines in a row and want all of them echoed, you'd have to adapt it to be stateful.)
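One possible shape for such a stateful version is sketched below. It assumes that a run of consecutive WordNode lines forms one group and that the whole group should be repeated before every gloss line that follows it. The function name expand_glosses is made up for this example, and the code sticks to POSIX sh constructs, so it does not need bash.

```shell
# Hypothetical helper: reads the file on stdin, writes the result to stdout.
expand_glosses() {
  nl='
'
  headers=''      # WordNode lines of the current group, newline-separated
  seen_gloss=0    # becomes 1 once the current group has printed a gloss line
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      WordNode*)
        # A WordNode line that follows a gloss line starts a new group.
        if [ "$seen_gloss" -eq 1 ]; then
          headers=''
          seen_gloss=0
        fi
        # Append to the group, separating saved lines with a newline.
        headers=${headers:+$headers$nl}$line
        ;;
      *)
        # Print every saved WordNode line, then the gloss line itself.
        if [ -n "$headers" ]; then
          printf '%s\n' "$headers"
        fi
        printf '%s\n' "$line"
        seen_gloss=1
        ;;
    esac
  done
}
```

It would be called as expand_glosses < file1.txt > file2.txt. For input where every group has a single WordNode line, the output should match that of the simpler loop.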
Answer 3: (score: 0)
$ awk '/WordNode/{h=$0 ORS;next} {print h $0}' file
WordNode"a'inai"
gloss word "repose"
WordNode "akti"
gloss word "running"
WordNode "akti"
gloss word "turned on"
WordNode "akti"
gloss word "active"
WordNode "aitco"
gloss word "Armenian"
WordNode "aitxero"
gloss word "ethereal"
WordNode "aitxero"
gloss word "ether"