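I am trying to modify a Perl script I found online that uses XML::Parser to identify the unique elements in an XML document and count how many times each one occurs. The Perl script and its documentation can be found here: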
use strict;
use warnings;

use XML::Parser;
use File::Find;

@ARGV or die "usage: xmlelements DIR [DIR ...]\n";

my %element_count;

my $parser = XML::Parser->new(
    Handlers => {
        Start => \&start_element,
    },
);

find \&process_xml, @ARGV;

print "$_ ($element_count{ $_ })\n"
    for sort keys %element_count;

exit;

sub process_xml {
    $parser->parsefile( $_ )
        if substr( $_, -4 ) eq '.xml' and -f;
}

sub start_element {
    my ( $expat, $element, @attrval ) = @_;
    $element_count{ $element }++;
}
This produces output like the following:
Account (15614)
Account_No (15504)
Active (15614)
Activity (6658)
Address (28098)
Address_1 (27548)
Address_2 (2033)
Address_3 (62)
Address_City (15)
My question is: how can I include the parent node in the output?
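Answer 0 (score: 1)
Inside a Start handler, the name of the parent node is given by current_element, so $name = $expat->current_element . '/' . $element and voilà! The root element has no parent, so current_element returns an undefined value for it. To avoid the resulting warning, use my $parent = $expat->current_element || ''; $name = "$parent/$element";
So the handler becomes: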
sub start_element {
    my ( $expat, $element, @attrval ) = @_;
    my $parent = $expat->current_element || '';
    my $name   = "$parent/$element";
    $element_count{$name}++;
}
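To see what the modified handler produces without pointing it at a directory, here is a minimal self-contained sketch that parses a small inline document instead of files. The sample XML is made up for illustration and is not from the original data; everything else is the handler above plus XML::Parser's parse method, which accepts a string.

```perl
use strict;
use warnings;
use XML::Parser;

my %element_count;

my $parser = XML::Parser->new(
    Handlers => { Start => \&start_element },
);

# Hypothetical sample document, invented for this illustration.
my $xml = <<'XML';
<Accounts>
  <Account>
    <Address><Address_1>1 Main St</Address_1></Address>
  </Account>
  <Account>
    <Address><Address_1>2 Side St</Address_1></Address>
  </Account>
</Accounts>
XML

# parse() takes the document as a string; parsefile() takes a file name.
$parser->parse($xml);

print "$_ ($element_count{ $_ })\n" for sort keys %element_count;

sub start_element {
    my ( $expat, $element ) = @_;
    # In a Start handler, current_element is the parent of $element.
    # It is undefined for the root element, hence the empty-string fallback.
    my $parent = $expat->current_element || '';
    $element_count{"$parent/$element"}++;
}
```

With this sample, the root element is counted under the key /Accounts (empty parent), and each nested element is counted under parent/element, for example Accounts/Account (2) and Address/Address_1 (2). The same element name under two different parents is therefore counted separately.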