My XML data looks like the sample below. All I want to do is extract the publication year from this structure:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<pre>
<PubmedArticle>
<MedlineCitation Owner="NLM" Status="In-Data-Review">
<PMID Version="1">23853691</PMID>
<DateCreated>
<Year>2013</Year>
<Month>07</Month>
<Day>15</Day>
</DateCreated>
<Article PubModel="Electronic-Print">
<Journal>
<ISSN IssnType="Electronic">1932-6203</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>8</Volume>
<Issue>5</Issue>
<PubDate>
<Year>2013</Year>
</PubDate>
...
</pre>
Why does my Perl code below fail to reach the year "2013"?
use strict;
use Data::Dumper;
use XML::LibXML 1.70;
my $parser = XML::LibXML->new();
my $xmlfilename = "myfile.xml";
# obtained from http://dpaste.com/1307466/plain/
my $doc = $parser->parse_file( $xmlfilename );
foreach my $art ( $doc->findnodes('/PubmedArticle/MedlineCitation/Article/Journal/JournalIssue/PubDate') ) {
    my ($year) = $art->findnodes('./Year');
    print Dumper $year->to_literal;
}
What is the correct way to do this?
Answer 0 (score: 3)
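You forgot the <pre> root element. An XPath expression that begins with / is resolved from the document root, and in this file the root element is <pre>, not <PubmedArticle>, so the original expression matches no nodes and the loop body never runs.
Change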
/PubmedArticle/MedlineCitation/Article/Journal/JournalIssue/PubDate
to
/pre/PubmedArticle/MedlineCitation/Article/Journal/JournalIssue/PubDate
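As a rough illustration of the corrected path, here is a minimal sketch. It makes several assumptions: the inline sample is a trimmed, hypothetical stand-in for myfile.xml (the DOCTYPE line is left out so nothing is fetched over the network), and the helper name pub_years is made up for this example. It uses load_xml and textContent rather than parse_file and to_literal, purely to keep the sketch self-contained.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical, trimmed-down stand-in for the file in the question.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<pre>
<PubmedArticle>
<MedlineCitation Owner="NLM" Status="In-Data-Review">
<PMID Version="1">23853691</PMID>
<Article PubModel="Electronic-Print">
<Journal>
<JournalIssue CitedMedium="Internet">
<PubDate>
<Year>2013</Year>
</PubDate>
</JournalIssue>
</Journal>
</Article>
</MedlineCitation>
</PubmedArticle>
</pre>
XML

# Absolute path that starts at the real root element, <pre>.
my $xpath = '/pre/PubmedArticle/MedlineCitation/Article/Journal/JournalIssue/PubDate/Year';

# pub_years is a hypothetical helper: it returns the text of every matching <Year>.
sub pub_years {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml_string );
    return map { $_->textContent } $doc->findnodes($xpath);
}

print "$_\n" for pub_years($xml);
```

If the wrapper element around the records is not known in advance, a relative search such as //PubDate/Year should also find the node, at the cost of matching any PubDate anywhere in the document.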