How can I extract the nodes I need from this XML file using Perl and XPath?

Asked: 2010-05-19 12:20:21

Tags: perl xpath libxml2

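I run an XPath expression to extract all year and value elements related to the death rate from an XML DB file. I then want to take each node from the resulting node list, find its year node and print it, then find its value node and print that too, each on its own line. The problem is that the output shows nothing.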

The XML content looks like this:

<dataset type="country" name="Afghanistan" total="222">
...
        <data>
             <country id="AFG">Afghanistan</country>
             <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
             <year>2006</year>
             <value>20.3410000</value>
           </data>
           <data>
             <country id="AFG">Afghanistan</country>
             <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
             <year>2007</year>
             <value>19.9480000</value>
           </data>
           <data>
             <country id="AFG">Afghanistan</country>
             <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
             <year>2008</year>
             <value>19.5720000</value>
           </data>
           <data>
             <country id="AFG">Afghanistan</country>
             <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator>
             <year>2005</year>
             <value>7.0000000</value>
           </data>
           <data>
             <country id="AFG">Afghanistan</country>
             <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator>
             <year>2006</year>
             <value>12.0000000</value>
           </data>
           <data>
             <country id="AFG">Afghanistan</country>
             <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator>
             <year>2007</year>
             <value>12.0000000</value>
           </data>
...
</dataset>

The Perl code looks like this:

# Use the XML::LibXML parser to find elements related to the death rate

my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($XML_DB);
my $root = XML::LibXML::XPathContext->new($tree->documentElement());
#print $nodeSet->to_literal(); 

foreach my $node ($root->findnodes("/*/data/indicator[\@id = 'SP.DYN.CDRT.IN']/following-sibling::*")) {
    #print $node->textContent() . "\n";
    #print $node->nodeName . "\n";
    print $node->find("year") . "\n";
}
exit;

1 answer:

Answer 0: (score: 2)

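The find("year") expression does not work the way you expect, because the complex selector does not end at the data nodes. It ends at the siblings that follow indicator, so each $node is already a year or value element, and neither of those has a year child for find("year") to match. Use Xacobeo to debug XPath expressions. This works: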

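use feature 'say';    # enables say(), available since Perl 5.10

foreach my $node ($root->findnodes(q{/*/data/indicator[@id = 'SP.DYN.CDRT.IN']/following-sibling::*})) {
    # Each $node is a <year> or <value> element; print its text content.
    say $_->toString for $node->childNodes;
}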

Output:

2006
20.3410000
2007
19.9480000
2008
19.5720000
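
If you want the year and value paired per record, one alternative is to select the data elements themselves, so that relative paths such as year and value resolve against their children. The following is a minimal sketch under stated assumptions, not part of the original answer: the helper name indicator_rows and the inline sample document are hypothetical, and the indicator id is interpolated into the XPath string without any escaping, which is only safe for ids that contain no quote characters.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Hypothetical helper (not from the original post): returns one
# [year, value] pair for every <data> element whose <indicator>
# child carries the given id.
sub indicator_rows {
    my ($xml, $indicator_id) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my @rows;
    # Select the <data> elements, not the siblings of <indicator>.
    # Assumption: $indicator_id contains no single quotes.
    for my $data ($doc->findnodes(qq{/*/data[indicator/\@id = '$indicator_id']})) {
        # Relative XPath expressions are evaluated against $data.
        push @rows, [ $data->findvalue('year'), $data->findvalue('value') ];
    }
    return @rows;
}

# Small sample shaped like the XML in the question.
my $sample_xml = <<'XML';
<dataset type="country" name="Afghanistan" total="222">
  <data>
    <country id="AFG">Afghanistan</country>
    <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
    <year>2006</year>
    <value>20.3410000</value>
  </data>
  <data>
    <country id="AFG">Afghanistan</country>
    <indicator id="IC.EXP.DOCS">Documents to export (number)</indicator>
    <year>2005</year>
    <value>7.0000000</value>
  </data>
  <data>
    <country id="AFG">Afghanistan</country>
    <indicator id="SP.DYN.CDRT.IN">Death rate, crude (per 1,000 people)</indicator>
    <year>2007</year>
    <value>19.9480000</value>
  </data>
</dataset>
XML

# Print each matching record as "year<TAB>value".
say join "\t", @$_ for indicator_rows($sample_xml, 'SP.DYN.CDRT.IN');
```

Because each row keeps its year and value together, a record with a missing value element would show up as an empty string next to its year instead of silently shifting the output.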