I have some XML files that look like this:
<machines>
<server>
127.0.0.1
</server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
I want to keep the server element and delete everything else, so the output should be:
<machines>
<server>
127.0.0.1
</server>
</machines>
I wrote the following script:
use warnings;
use strict;
use feature ':5.10';
use XML::Twig;
my $path='C:\strawberry\perl\site\lib\file.xml';
my $filehandle;
my $tweak_server =sub{
my ($twig, $root) =@_;
my $elt=$root;
while( $elt=$elt->next_elt($root)){
my $tag=$elt->tag;
say $tag;
if ($tag!~/server/){
$elt->delete($tag);
}
}
$twig->flush;
};
open( $filehandle, "+<$path") or die "cannot open out file out_file:$!";
my $roots = { machines => 1 };
my $handlers = { 'machines' => $tweak_server,
};
my $twig = new XML::Twig(TwigRoots => $roots,
TwigHandlers => $handlers,
pretty_print => 'indented'#,
# twig_print_outside_roots => \*$filehandle
);
$twig->parsefile($path);
close $filehandle;
and got this output:
server
#PCDATA
<machines>
<server></server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
I really don't understand why "#PCDATA" shows up, or why the script doesn't behave as I expected.
@mirod I tried the following:
use warnings;
use strict;
use feature ':5.10';
use XML::Twig;
my $tweak_server =sub{
my ($twig, $root) =@_;
my $elt=$root;
my $text=$elt->first_child_text('id');
if ($text=~m/12/){
while( $elt=$elt->next_elt('#ELT')){
my $tag=$elt->tag;
say $tag;
if ($tag!~/id/){
$elt->delete;
}
}
}
};
my $roots = { machines => 1 };
my $handlers = { 'machines/aaa' => $tweak_server,
};
my $twig =XML::Twig->new(TwigRoots => $roots,
TwigHandlers => $handlers,
pretty_print => 'indented'#,
# twig_print_outside_roots => \*$filehandle
)
->parse( \*DATA)
->print;
__DATA__
<machines>
<server> 127.0.0.1 </server>
<aaa>
<id>12</id>
<ip>127.0.0.2</ip>
<option>127.0.0.6</option>
<etc>abc</etc>
</aaa>
<aaa>
<id>14</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
<aaa>
<id>15</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
</machines>
and the output is:
<machines>
<server> 127.0.0.1 </server>
<aaa>
<id>12</id>
<option>127.0.0.6</option>
<etc>abc</etc>
</aaa>
<aaa>
<id>14</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
<aaa>
<id>15</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
</machines>
What I want is to delete all three of these elements, not just one:
<ip>127.0.0.2</ip>
<option>127.0.0.6</option>
<etc>abc</etc>
that sit under the same parent as
<id>12</id>
Any suggestions?
Answer 0 (score: 2)
The following deletes the proxy element:
use warnings;
use strict;
use XML::Twig;
my $str = '
<machines>
<server>
127.0.0.1
</server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
';
my $t = XML::Twig->new(
twig_handlers => {
proxy => sub { $_->delete() },
},
pretty_print => 'indented',
);
$t->parse($str);
$t->print;
print "\n";
__END__
<machines>
<server>
127.0.0.1
</server>
</machines>
If you don't want server and #PCDATA printed, remove the say $tag; line.
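On where "#PCDATA" comes from: XML::Twig stores text content as elements of its own, tagged '#PCDATA', so a walk over every node with next_elt also visits the text inside server. The sketch below is only an illustration of that behaviour, using a compacted copy of the XML from the question; the variable names are made up for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Compact input, so no whitespace-only text nodes are involved.
my $xml = '<machines><server>127.0.0.1</server>'
        . '<proxy><ip>127.0.0.2</ip></proxy></machines>';

my $twig = XML::Twig->new;
$twig->parse($xml);

# With no condition, descendants() also returns the text nodes,
# which XML::Twig represents as elements tagged '#PCDATA'.
my @all_tags = map { $_->tag } $twig->root->descendants;

# The '#ELT' condition restricts the walk to real elements.
my @elt_tags = map { $_->tag } $twig->root->descendants('#ELT');

print "all:      @all_tags\n";
print "elements: @elt_tags\n";
```

This also suggests why the loop in the question stops early: the text node does not match /server/, so it is deleted, and calling next_elt on an element that has just been cut out of the tree most likely has nowhere left to go.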
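Answer 1 (score: 2)
If your requirement is to keep only the server elements, you can tell the module so by setting them as twig_roots. This keeps the root of the XML and the server elements (with their content), while discarding everything else: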
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
XML::Twig->new( twig_roots => { server => 1 },
pretty_print => 'indented',
)
->parse( \*DATA)
->print;
__DATA__
<machines>
<server>
127.0.0.1
</server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
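For the follow-up in the question (stripping every child except id from the aaa element whose id is 12), one possible approach is to collect the children first and delete them afterwards, rather than deleting while stepping through the tree with next_elt. This is a minimal sketch, not a tested solution for the real files: it assumes the id must equal 12 exactly (the question uses a looser m/12/ match), and the sample data is a shortened copy of the question's.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = '<machines><server>127.0.0.1</server>'
        . '<aaa><id>12</id><ip>127.0.0.2</ip>'
        . '<option>127.0.0.6</option><etc>abc</etc></aaa>'
        . '<aaa><id>14</id><ip>127.0.0.2</ip><etc>abc</etc></aaa>'
        . '</machines>';

my $twig = XML::Twig->new(
    twig_handlers => {
        'machines/aaa' => sub {
            my ($t, $aaa) = @_;
            # Assumption: only the <aaa> whose id is exactly 12 is stripped.
            return unless $aaa->first_child_text('id') eq '12';
            # children() returns a list built up front, so deleting
            # its members does not disturb the iteration.
            $_->delete for grep { $_->tag ne 'id' } $aaa->children;
        },
    },
);
$twig->parse($xml);

my $result = $twig->root->sprint;
print "$result\n";
```

Because the list of children is built before anything is deleted, removing one sibling cannot cut the loop short, which appears to be what happened in the question's version after the first delete.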