XML::Twig: copying a node tree from one file to another

Date: 2015-12-24 10:43:14

Tags: xml perl perl-module xml-twig

I have two huge XML files (> 50 MB) and want to append a node tree from one of them to the other.

test1.xml

<?xml version="1.0" encoding="UTF-8"?>
<root>
<blocks>
    <block name="block1" id="1">
        <lotsofcontents/>
    </block>
    <block name="block3" id="3">
            <lotsofcontents />
    </block>
</blocks>
</root>

test2.xml

<?xml version="1.0" encoding="UTF-8"?>
<root>
<blocks>
    <block name="block2" id="2">
        <lotsofcontents/>
    </block>
    <block name="block4" id="4">
            <lotsofcontents />
    </block>
</blocks>
</root>

I want to copy all the block elements from test2.xml into test1.xml, so that the result looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<root>
<blocks>
    <block name="block1" id="1">
        <lotsofcontents/>
    </block>
    <block name="block3" id="3">
        <lotsofcontents/>
    </block>
    <block name="block2" id="2">
        <lotsofcontents/>
    </block>
    <block name="block4" id="4">
        <lotsofcontents/>
    </block>
</blocks>
</root>

I tried the following XML::Twig code:

combine.pl

use strict;
use XML::Twig;

my $testa = "test1.xml";
my $testb = "test2.xml";
my $result = "result.xml";
my $MDTAG = "block";

my @blocks;

my $twig = XML::Twig->new( twig_handlers => {
                                            $MDTAG => sub {push @blocks; $_->cut_children; },
                                        },
                       pretty_print => 'indented',
                       empty_tags => 'expand',
                     );
$twig->parsefile($testb);

my $phere = XML::Twig->new( twig_handlers => {
                                            $MDTAG => sub { foreach my $block (@blocks)
                                                    { $block->paste( first_child => $_); } },
                                         },
                       pretty_print => 'indented',
                     );
$phere->parsefile($testa);
$phere->print_to_file($result);

I get the warnings below. result.xml is generated, but nothing from test2.xml has been appended to it.

Useless use of push with no values at combine.pl line 16.
Use of tied on a handle without * is deprecated at /opt/perl/lib/XML/Parser/Expat.pm line 447.
Use of tied on a handle without * is deprecated at /opt/perl/lib/XML/Parser/Expat.pm line 447.

Any corrections or comments are appreciated.

1 answer:

Answer 0: (score: 1)

I think that, for the main document, you need a handler on the blocks element rather than on each block element.
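Two details in the question's script also seem to matter. `push @blocks;` names no value to append, which is what the "Useless use of push with no values" warning reports, so @blocks stays empty and the loop in the second handler never runs. In addition, `cut_children` detaches the children of a block, not the block itself. A minimal plain-Perl illustration of the push fix, where `$item` is a hypothetical stand-in for the element that XML::Twig passes to the handler in `$_`:

```perl
use strict;
use warnings;

my @blocks;
my $item = 'block2';    # hypothetical stand-in for the element in $_

# push @blocks;         # as written in the question: nothing is appended
push @blocks, $item;    # the value to append must be given explicitly

print scalar(@blocks), "\n";
```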

Use:

use strict;
use warnings;
use XML::Twig;

my $testa = "test2015122401.xml";
my $testb = "test2015122402.xml";
my $result = "result2015122401.xml";

my @blocks;

# Pass 1: collect every block element of the second file,
# detaching each one from its tree.
my $t1 = XML::Twig->new(
    twig_handlers => { 'block' => sub { push @blocks, $_; $_->cut(); } },
);
$t1->parsefile($testb);


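# Pass 2: once the blocks element of the first file has been parsed,
# append the collected block elements after its existing children.
my $phere = XML::Twig->new(
    twig_handlers => {
        '/root/blocks' => sub {
            foreach my $block (@blocks) {
                $block->paste( last_child => $_ );
            }
        },
    },
    pretty_print => 'indented',
);
$phere->parsefile($testa);
$phere->print_to_file($result);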

With the sample you posted, this gives me the desired result.
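To try the idea without any files on disk, here is a rough sketch of the same cut-and-paste step with no handlers at all. The two XML strings are hypothetical in-memory stand-ins for test1.xml and test2.xml, and the variable names are my own. Like the script above, it keeps both documents fully in memory, so it is meant as an illustration and not as a tuned solution for 50 MB inputs.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical in-memory stand-ins for test1.xml and test2.xml.
my $xml1 = '<root><blocks>'
         . '<block name="block1" id="1"><lotsofcontents/></block>'
         . '<block name="block3" id="3"><lotsofcontents/></block>'
         . '</blocks></root>';
my $xml2 = '<root><blocks>'
         . '<block name="block2" id="2"><lotsofcontents/></block>'
         . '<block name="block4" id="4"><lotsofcontents/></block>'
         . '</blocks></root>';

my $main  = XML::Twig->new( pretty_print => 'indented' );
my $donor = XML::Twig->new;
$main->parse($xml1);
$donor->parse($xml2);

# The element that should receive the extra block elements.
my $target = $main->root->first_child('blocks');

# children() returns a list up front, so cutting inside the loop is fine.
for my $block ( $donor->root->first_child('blocks')->children('block') ) {
    $block->cut;                               # detach from the donor tree
    $block->paste( last_child => $target );    # append to the main tree
}

# Collect the ids in document order to check the result.
my @ids = map { $_->att('id') } $target->children('block');
print join( ',', @ids ), "\n";
$main->print;
```

If memory becomes a problem with the real files, XML::Twig's `flush` method, called from a handler, may be worth a look, since it prints what has been parsed so far and frees it.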