I have an XML file that I am trying to parse. It contains something like this:
**</data_item>
</data_item>
</data_item>**
<xml version>
</data_item>
<some random text>
</data_item>
<some random text>
**</data_item>
</data_item>
</data_item>**
<xml version>
**</data_item>
</data_item>
</data_item>**
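The lines highlighted in bold contain three data_item closing tags back to back. For each such group (except the last group of three), I want to delete two of them and keep only one. There are 7-8 such occurrences. I am trying to use the string xml version to reach the two lines above it and delete them. Please help me do this with a sed one-liner.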
Answer 0 (score: 0)
Here is one way to do this using GNU sed:
$ sed '/<\/data_item>/{N;/<\/data_item>$/{N;$!{s/\n//;D}}}' file
</data_item>
<xml version>
</data_item>
<some random text>
</data_item>
<some random text>
</data_item>
<xml version>
</data_item>
</data_item>
</data_item>
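To check the command before running it on the real file, a minimal sketch like the one below may help. It assumes GNU sed is installed, recreates the sample from the question in a temporary file (the variable names `input`, `result`, and `closing_count` are only illustrative), and prints the result.

```shell
# Recreate the sample input from the question in a temporary file.
# The temp file only stands in for the real XML file.
input=$(mktemp)
cat > "$input" <<'EOF'
</data_item>
</data_item>
</data_item>
<xml version>
</data_item>
<some random text>
</data_item>
<some random text>
</data_item>
</data_item>
</data_item>
<xml version>
</data_item>
</data_item>
</data_item>
EOF

# Run the one-liner (GNU sed) and capture what it prints.
result=$(sed '/<\/data_item>/{N;/<\/data_item>$/{N;$!{s/\n//;D}}}' "$input")
printf '%s\n' "$result"

# Count the closing tags that remain, as a quick sanity check.
closing_count=$(printf '%s\n' "$result" | grep -c '</data_item>')
printf 'closing tags left: %s\n' "$closing_count"

rm -f "$input"
```

Once the printed result looks right, GNU sed's `-i` option can apply the same script to the file in place. Keeping a backup copy first is a sensible precaution.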
**Explanation**
sed '
/<\/data_item>/ {          # On a line containing </data_item>
    N                      # Append the next line to the pattern space
    /<\/data_item>$/ {     # If the appended line also ends with </data_item>
        N                  # Append a third line to the pattern space
        $! {               # Unless that third line is the last line of the file
            s/\n//         # Remove the first newline, joining lines one and two
            D              # Delete up to the remaining newline, then rerun the script on the third line
        }
    }
}
' file
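The sed script above keys on groups of exactly three closing tags and does not look at the `<xml version>` line that the question mentions. It also does not check what the third line is, so a pair of closing tags followed by some other line would lose both tags. If the real file has groups of other lengths, an alternative sketch in awk (a different tool, so not the sed one-liner that was asked for) is to hold back closing-tag lines and decide what to print once the next line is known. The function name `collapse_before_xml_version` is made up for this example, and the patterns assume the lines look exactly like the sample.

```shell
# Hypothetical helper: collapse a run of closing tags to a single line,
# but only when the run is directly followed by the "<xml version>" line.
collapse_before_xml_version() {
    awk '
    # Hold back every closing-tag line instead of printing it right away
    /<\/data_item>$/ { held[++run] = $0; next }
    {
        # A held run followed by the xml marker keeps only its last line;
        # any other run is printed in full
        first = (run > 0 && $0 ~ /<xml version>/) ? run : 1
        for (i = first; i <= run; i++) print held[i]
        run = 0
        print
    }
    # A run at end of file has no marker after it, so keep it whole
    END { for (i = 1; i <= run; i++) print held[i] }
    ' "$@"
}

# Example on a fragment shaped like the sample from the question
demo=$(printf '%s\n' \
    '<some random text>' \
    '</data_item>' '</data_item>' '</data_item>' \
    '<xml version>' \
    '</data_item>' '</data_item>' |
    collapse_before_xml_version)
printf '%s\n' "$demo"
```

Called with a file name (for example `collapse_before_xml_version file`), it reads that file instead of standard input. On the sample from the question it should produce the same output as the sed command, since every group followed by `<xml version>` there has three lines.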