I have two huge XML files (> 50 MB) and want to append a node subtree from one of them to the other.
test1.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
<blocks>
<block name="block1" id="1">
<lotsofcontents/>
</block>
<block name="block3" id="3">
<lotsofcontents />
</block>
</blocks>
</root>
test2.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
<blocks>
<block name="block2" id="2">
<lotsofcontents/>
</block>
<block name="block4" id="4">
<lotsofcontents />
</block>
</blocks>
</root>
I want to copy all the blocks from test2.xml into test1.xml, so that the result looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<blocks>
<block name="block1" id="1">
<lotsofcontents/>
</block>
<block name="block3" id="3">
<lotsofcontents/>
</block>
<block name="block2" id="2">
<lotsofcontents/>
</block>
<block name="block4" id="4">
<lotsofcontents/>
</block>
</blocks>
</root>
I tried the following XML::Twig code:
combine.pl
use strict;
use XML::Twig;
my $testa = "test1.xml";
my $testb = "test2.xml";
my $result = "result.xml";
my $MDTAG = "block";
my @blocks;
my $twig = XML::Twig->new( twig_handlers => {
$MDTAG => sub {push @blocks; $_->cut_children; },
},
pretty_print => 'indented',
empty_tags => 'expand',
);
$twig->parsefile($testb);
my $phere = XML::Twig->new( twig_handlers => {
$MDTAG => sub { foreach my $block (@blocks)
{ $block->paste( first_child => $_); } },
},
pretty_print => 'indented',
);
$phere->parsefile($testa);
$phere->print_to_file($result);
I get the warnings below. result.xml is generated, but nothing from test2.xml has been appended to it.
Useless use of push with no values at combine.pl line 16.
Use of tied on a handle without * is deprecated at /opt/perl/lib/XML/Parser/Expat.pm line 447.
Use of tied on a handle without * is deprecated at /opt/perl/lib/XML/Parser/Expat.pm line 447.
Any corrections and comments are appreciated.
Answer 0 (score: 1)
I think that for the main document you need to set the handler on the blocks element, not on each block element.
Using
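use strict;
use warnings;
use XML::Twig;
my $testa = "test2015122401.xml";    # main document (test1.xml in the question)
my $testb = "test2015122402.xml";    # document whose blocks are appended (test2.xml)
my $result = "result2015122401.xml";
my @blocks;
# First pass: collect every block element of the second file and detach it from its tree.
my $t1 = XML::Twig->new(
    twig_handlers => { 'block' => sub { push @blocks, $_; $_->cut(); } }
);
$t1->parsefile($testb);
# Second pass: once the blocks element of the main file has been parsed,
# append the collected elements after its existing children.
my $phere = XML::Twig->new(
    twig_handlers => {
        '/root/blocks' => sub {
            foreach my $block (@blocks) {
                $block->paste(last_child => $_);
            }
        },
    },
    pretty_print => 'indented',
);
$phere->parsefile($testa);
$phere->print_to_file($result);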
With this I get the desired result for the sample you posted.
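For anyone who wants to try the idea without touching the large files, here is a minimal in-memory sketch of the same approach. The XML strings are shortened stand-ins for the two sample files, and the variable names ($xml_a, $xml_b, $source, $target, @ids) are made up for this illustration. It also shows the likely reason the original attempt appended nothing: push @blocks; was given no value, so @blocks stayed empty, whereas push @blocks, $_; stores the current element.

```perl
use strict;
use warnings;
use XML::Twig;

# Shortened stand-ins for test1.xml and test2.xml (assumption: same layout as the samples).
my $xml_a = '<root><blocks><block name="block1" id="1"/><block name="block3" id="3"/></blocks></root>';
my $xml_b = '<root><blocks><block name="block2" id="2"/><block name="block4" id="4"/></blocks></root>';

# Collect the block elements of the second document and detach them from their tree.
my @blocks;
my $source = XML::Twig->new(
    twig_handlers => { block => sub { push @blocks, $_; $_->cut; } },
);
$source->parse($xml_b);

# When the blocks element of the main document is complete, append the collected elements.
my $target = XML::Twig->new(
    twig_handlers => {
        '/root/blocks' => sub {
            my ($twig, $parent) = @_;    # handlers receive the twig and the element
            $_->paste(last_child => $parent) for @blocks;
        },
    },
);
$target->parse($xml_a);

# List the ids of the block elements in document order.
my @ids = map { $_->att('id') } $target->root->first_child('blocks')->children('block');
print join(',', @ids), "\n";
```

Pasting with last_child keeps the appended blocks in their original order after the existing ones. Pasting each one as first_child in the same loop would put them in front of the existing blocks and in reverse order. Note that both the answer's script and this sketch build the full trees in memory, so for files well above 50 MB memory use may become the limiting factor.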