Question

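I need help with a sed command to delete some elements from an XML file. I know I could use Saxon for this, but I don't think it would be efficient.

The situation: I receive an array from an external script. For every field in the array, I want to use sed to check whether that field is contained in a particular element of the XML file. If it is, I want to delete that element together with its parent element and all of the parent's children.

The XML file looks like this: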

<league>
   <team>club1</team>
      <id>1001</id> 
      <position>12</position>
      <description>
         <comment>frist comment</comment>
         <logo>no logo available</logo>
         <path>/home/stack/overflow/id111-111-222-222</path>
      </description>
   </team> 
   <team>club2</team>
      <id>1002</id> 
      <position>42</position>
      <description>
         <comment>second comment</comment>
         <logo>logo available</logo>
         <path>/home/stack/overflow/id333-333-444-444</path>
      </description>
   </team>

   ...

   <team>clubn</team>
      <id>100n</id> 
      <position>n</position>
      <description>
         <comment>nth comment</comment>
         <logo>no logo available</logo>
         <path>/home/stack/overflow/id888-888-999-999</path>
      </description>
   </team>

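Now, for a given array field, sed should check whether the field is contained in any path element. If it is, sed should delete that path element together with its parent element and all of the parent's other children.

For example, with the array field filed[1] = 888-888-999-999, the result should look like this: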

<league>
   <team>club1</team>
      <id>1001</id> 
      <position>12</position>
      <description>
         <comment>frist comment</comment>
         <logo>no logo available</logo>
         <path>/home/stack/overflow/id111-111-222-222</path>
      </description>
   </team> 
   <team>club2</team>
      <id>1002</id> 
      <position>42</position>
      <description>
         <comment>second comment</comment>
         <logo>logo available</logo>
         <path>/home/stack/overflow/id333-333-444-444</path>
      </description>
   </team>

   ...

   <team>clubn</team>
      <id>100n</id> 
      <position>n</position>
   </team>

I hope someone can understand me and my problem :)


Answer 1

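This joins the fields of the filed array into an alternation, then has sed collect the lines of each parent element and delete them if they contain a path element with one of the array fields: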

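# Join the array fields with '|' (note: this leaves IFS set to '|' afterwards).
IFS='|' regexp="${filed[*]}"
# Collect each <description>...</description> block into the pattern space, then
# delete it if its <path> contains one of the fields (GNU sed; XML read from stdin).
sed "/<description>/{:label;/<\/description>/!{N;blabel};
                     /<path>.*\(${regexp//|/\|}\).*<\/path>/d}"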
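
To try this approach on a small input, here is a minimal sketch of the same idea wrapped in a function. The name delete_descriptions and the abbreviated sample data are made up for illustration and are not part of the original answer. This variant uses sed -E with separate -e fragments instead of the one-liner above, so the alternation is a plain | and needs no escaping. It assumes the array fields contain no regex metacharacters.

```shell
# Hypothetical helper: reads XML on stdin and deletes every
# <description>...</description> block whose <path> contains any of the
# strings passed as arguments. The function body is a subshell "( ... )",
# so the IFS change below does not leak into the calling shell.
delete_descriptions() (
  if [ "$#" -eq 0 ]; then
    cat                 # nothing to match: pass the input through unchanged
  else
    IFS='|'
    regexp="$*"         # join the arguments with '|', e.g. 111|888
    sed -E \
      -e '/<description>/{' \
      -e ':label' \
      -e '/<\/description>/!{' \
      -e 'N' \
      -e 'blabel' \
      -e '}' \
      -e "/<path>.*(${regexp}).*<\/path>/d" \
      -e '}'
  fi
)

# Abbreviated copy of the sample file from the question (illustration only).
sample=$(cat <<'EOF'
<league>
   <team>club2</team>
      <id>1002</id>
      <description>
         <comment>second comment</comment>
         <path>/home/stack/overflow/id333-333-444-444</path>
      </description>
   </team>
   <team>clubn</team>
      <id>100n</id>
      <description>
         <comment>nth comment</comment>
         <path>/home/stack/overflow/id888-888-999-999</path>
      </description>
   </team>
EOF
)

# Remove the description block whose path contains 888-888-999-999.
printf '%s\n' "$sample" | delete_descriptions 888-888-999-999
```

In the output, the description block of clubn is gone while club2 is left untouched. With the array from the question, the call would be delete_descriptions "${filed[@]}" < file.xml in bash, where file.xml stands for your input file.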

sed command to delete an XML element that contains part of a string
