Modifying a file using Perl

Date: 2012-09-13 08:42:34

Tags: regex perl xml-parsing

I have an XML file that looks like this:

     <?xml version="1.0"?>
     <product = "AAA">
     <shell name = "110">
         <style = "000" size ="3"/>
         <style = "200" size ="3"/>
         <style = "800" size ="1"/>
         <style = "0900" size ="3"/>
     </shell name>
     </product>
     <product = "AAA">
     <shell name = "310">
         <style = "000" size ="3"/>
         <style = "200" size ="3"/>
         <style = "800" size ="1"/>
         <style = "0900" size ="3"/>
     </shell name>
     </product>
     <product = "BBB">
     <shell name = "10">
         <style = "000" size ="3"/>
         <style = "200" size ="3"/>
         <style = "800" size ="1"/>
         <style = "0900" size ="3"/>

     </shell name>
     </product>
     <product = "BBB">
      <shell name = "10010">
         <style = "0300" size ="3"/>
         <style = "2030" size ="3"/>
         <style = "8003" size ="1"/>
         <style = "09003" size ="3"/>
     </shell name>
     </product>
     <product = "BBB">
      <shell name = "110">
         <style = "0300" size ="3"/>
         <style = "2030" size ="3"/>
         <style = "8003" size ="1"/>
         <style = "09003" size ="3"/>
     </shell name>
     </product>

I want to write a script that groups together the shells of the same product, so that I get output like this:

     <?xml version="1.0"?>
     <product = AAA>
     <shell name = "110">
         <style = "000" size ="3"/>
         <style = "200" size ="3"/>
         <style = "800" size ="1"/>
         <style = "0900" size ="3"/>
     </shell name>
    <shell name = "310">
         <style = "000" size ="3"/>
         <style = "200" size ="3"/>
         <style = "800" size ="1"/>
         <style = "0900" size ="3"/>
     </shell name>
   </product>
     <product = BBB>
    <shell name = "10">
         <style = "000" size ="3"/>
         <style = "200" size ="3"/>
         <style = "800" size ="1"/>
         <style = "0900" size ="3"/>
     </shell name>
     <shell name = "10010">
         <style = "0300" size ="3"/>
         <style = "2030" size ="3"/>
         <style = "8003" size ="1"/>
         <style = "09003" size ="3"/>
     </shell name>
     <shell name = "110">
         <style = "0300" size ="3"/>
         <style = "2030" size ="3"/>
         <style = "8003" size ="1"/>
         <style = "09003" size ="3"/>
     </shell name>
     </product>

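Any ideas on how I should proceed? I was thinking of searching for the string </product>

 <product = "AAA">

but how can I reach its second occurrence? I know how to read a file and count the occurrences of a particular string, but can anyone help me with how to reach the second occurrence of a particular string?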
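
For the literal part of the question (reaching the second occurrence of a string), a minimal sketch using Perl's built-in index with a start offset is shown here. The nth_index helper and the sample text are hypothetical, and plain string searching like this is fragile on XML, which is why the answers use an XML parser instead.

```perl
use strict;
use warnings;

# Return the offset of the $n-th occurrence (1-based) of $needle in $text,
# or -1 if there are fewer than $n occurrences.
# nth_index is a hypothetical helper name, not a Perl built-in.
sub nth_index {
    my ($text, $needle, $n) = @_;
    my $pos = -1;
    for (1 .. $n) {
        # resume the search one character past the start of the previous match
        $pos = index($text, $needle, $pos + 1);
        return -1 if $pos < 0;
    }
    return $pos;
}

my $text = 'a<product>b<product>c';
print nth_index($text, '<product>', 2), "\n";   # prints 11
```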

2 Answers:

Answer 0 (score: 2)

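I modified your XML file to make it well-formed: everything is wrapped in a single products root element, the unnamed attributes are given names (id on product, n on style), and each shell is closed with </shell>: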

<?xml version="1.0"?>
<products>
     <product id="AAA">
     <shell name="110">
         <style n="000" size="3"/>
         <style n="200" size="3"/>
         <style n="800" size="1"/>
         <style n="0900" size="3"/>
     </shell>
     </product>
     <product id="AAA">
     <shell name="310">
         <style n="000" size="3"/>
         <style n="200" size="3"/>
         <style n="800" size="1"/>
         <style n="0900" size="3"/>
     </shell>
     </product>
     <product id="BBB">
     <shell name="10">
         <style n="000" size="3"/>
         <style n="200" size="3"/>
         <style n="800" size="1"/>
         <style n="0900" size="3"/>

     </shell>
     </product>
     <product id="BBB">
      <shell name="10010">
         <style n="0300" size="3"/>
         <style n="2030" size="3"/>
         <style n="8003" size="1"/>
         <style n="09003" size="3"/>
     </shell>
     </product>
     <product id="BBB">
      <shell name="110">
         <style n="0300" size="3"/>
         <style n="2030" size="3"/>
         <style n="8003" size="1"/>
         <style n="09003" size="3"/>
     </shell>
     </product>
</products>

Then I processed it with XML::XSH2:

open 1.xml ;
for //product {
    my $id = @id ;
    mv shell append //product[@id=$id][1] ;
}
rm //product[not(shell)] ;
save --backup ;
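
As a library-independent illustration of what the script above does (collect every shell under the first product with the same id, then drop the products left empty), here is a minimal sketch on plain Perl data. The [id, shell name] pairs and the group_shells helper are hypothetical stand-ins for the parsed document, not part of XML::XSH2.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed file: one [product id, shell name] pair per <product>.
my @products = (
    [ AAA => '110' ], [ AAA => '310' ],
    [ BBB => '10'  ], [ BBB => '10010' ], [ BBB => '110' ],
);

# Group shell names by product id, keeping the ids in first-seen order.
sub group_shells {
    my @pairs = @_;
    my (%shells_for, @order);
    for my $pair (@pairs) {
        my ($id, $shell) = @$pair;
        push @order, $id unless exists $shells_for{$id};   # first time this id is seen
        push @{ $shells_for{$id} }, $shell;                # append under the existing product
    }
    return map { [ $_ => $shells_for{$_} ] } @order;
}

for my $group (group_shells(@products)) {
    my ($id, $shells) = @$group;
    print "$id: @$shells\n";
}
```

Keeping a separate @order array preserves the document order of the products, which a plain hash would not guarantee.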

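Answer 1 (score: 1)

With XML::Twig, you could do it this way (the handler reads each product's id attribute, so it needs well-formed input like the version above):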

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

XML::Twig->new( twig_handlers => { product => \&product },
                pretty_print => 'indented',
              )
         ->parsefile_inplace( 'so_conc.xml');

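sub product
  { my( $t, $product)= @_;
    # the first product has no previous sibling: nothing to merge yet
    my $prev_product=  $product->prev_sibling( 'product') || return;
    if( $product->id eq $prev_product->id)
      { # same id as the previous product: move the shell there, drop the emptied product
        $product->first_child( 'shell')->move( last_child => $prev_product);
        $product->delete;
      }
    else
      { # a new id starts: the previous product is complete, so output it and free its memory
        $t->flush_up_to( $prev_product);
      }
  }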

The flush_up_to line ensures that only one product is kept in memory at a time and, together with the call to parsefile_inplace, updates the original file.
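
To make the streaming idea concrete without XML::Twig, here is a minimal sketch that merges consecutive records sharing an id while holding only the current group in memory. The merge_adjacent helper and its two callbacks are hypothetical: the iterator stands in for the parser delivering products one by one, and the emit callback stands in for flushing a finished product to the output.

```perl
use strict;
use warnings;

# Merge consecutive records that share an id, keeping only the current group in memory.
# $next returns [id, shell] or undef at end of input; $emit receives each finished group.
# Both callbacks are hypothetical stand-ins, not XML::Twig calls.
sub merge_adjacent {
    my ($next, $emit) = @_;
    my $current;                                  # [id, [shells...]] being built
    while (defined(my $rec = $next->())) {
        my ($id, $shell) = @$rec;
        if ($current && $current->[0] eq $id) {
            push @{ $current->[1] }, $shell;      # same id as the previous record: merge
        }
        else {
            $emit->($current) if $current;        # id changed: the previous group is complete
            $current = [ $id, [$shell] ];
        }
    }
    $emit->($current) if $current;                # emit the last group at end of input
}

my @input = ([ AAA => '110' ], [ AAA => '310' ], [ BBB => '10' ]);
merge_adjacent(
    sub { shift @input },
    sub { my ($id, $shells) = @{ $_[0] }; print "$id: @$shells\n" },
);
```

Like the handler above, which compares a product only with its previous sibling, this merges adjacent records only: an id that reappears after a different id starts a new group.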