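I am trying to parse the abstract section from an XML file, and I am using ForceArray. The code I wrote only works when the abstract is an array; it fails when there is no array. That is because in the array case I also use {content}, and when the abstract is not an array, {content} is missing. The code is below: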
use LWP::Simple;
use XML::Simple;
use Data::Dumper;
open (FH, ">:utf8","xmlparsed2.txt");
my $db1 = "pubmed";
my $query = "9915366";
my $q = 16404398;
my $xml = new XML::Simple;
$urlxml = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=$db1&id=$q&retmode=xml&rettype=abstract";
$dataxml = get($urlxml);
$data = $xml->XMLin("$dataxml", ForceArray => [qw( MeshHeading Author AbstractText )], ForceContent => 1);
print FH Dumper($data);
print FH "Abstract: ".join "\n", map {join ":",($_->{NlmCategory},$_->{content})} @{$data->{PubmedArticle}->{MedlineCitation}->{Article}->{Abstract}->{AbstractText}};
print FH "\n";
print FH "Title: "."$data->{PubmedArticle}->{MedlineCitation}->{Article}->{ArticleTitle}\n";
print FH "\n";
print FH "MeSH: ".join '$$', map $_->{DescriptorName}{content}, @{$data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading}};
print FH "\n";
print FH "Authors: ".join '$$', map {join " ",($_->{LastName},$_->{ForeName})} @{$data->{PubmedArticle}{MedlineCitation}{Article}{AuthorList}{Author}};
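When the abstract is an array (replace $q with $query in $urlxml), I want each part of the abstract together with its NlmCategory, like "OBJECTIVE: To determine whether ...". The code above gives me the output I want, but with a hash reference at the end, like this: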
METHODS:Tertiary care outpatient and inpatient rehabilitation center directly attached to a university hospital.:HASH(0x69d0810).
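For an abstract that is not an array ($q in $urlxml), this code does not seem to work, probably because there is no content key (I found this in the Data::Dumper output). I played around with it, and it works if I use something like $_ as in the array case, but then it also prints two colons (::). In short, I want my code to work for both $query and $q. Can you help?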
Answer 0 (score: 4)
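Use ForceContent => 1. With that option, XML::Simple represents every element as a hash with a content key, even when the element has no attributes, so $_->{content} works in both cases.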
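As a rough illustration of what ForceContent changes, here is a minimal offline sketch. The two inline XML strings are made-up samples, not real PubMed output, and the @samples and @lines variables are just names chosen for this example. It also skips the category when NlmCategory is absent, which avoids the stray colon mentioned in the question.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical samples: one AbstractText with an attribute, one without.
my @samples = (
    '<Abstract><AbstractText NlmCategory="METHODS">Tertiary care center.</AbstractText></Abstract>',
    '<Abstract><AbstractText>Plain abstract text.</AbstractText></Abstract>',
);

my @lines;
for my $sample (@samples) {
    my $data = XMLin(
        $sample,
        ForceArray   => [qw( AbstractText )],  # always an array ref
        ForceContent => 1,                     # always a hash with {content}
    );

    for my $text (@{ $data->{AbstractText} }) {
        my $category = $text->{NlmCategory};
        # Only prefix the category when the attribute is present.
        push @lines, defined $category
            ? "$category:$text->{content}"
            : $text->{content};
    }
}

print "$_\n" for @lines;
```

Under these assumptions, both samples go through the same loop: the first line is printed with its METHODS prefix and the second is printed as plain text.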
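Alternatively, use XML::LibXML with an XPath query, which treats one AbstractText element and several of them the same way: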
use strict;
use warnings;
use feature qw( say );
use LWP::Simple qw( get );
use XML::LibXML qw( );
use URI qw( );
binmode STDOUT, ':encoding(UTF-8)';
my $db = "pubmed";
my $id = $ARGV[0] || '9915366';
my $url = URI->new('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi');
$url->query_form(
    db      => $db,
    id      => $id,
    retmode => 'xml',
    rettype => 'abstract',
);
my $xml = get($url);
my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($xml);
my $root = $doc->documentElement();
for my $node ($root->findnodes('PubmedArticle/MedlineCitation/Article/Abstract/AbstractText')) {
    say join ':', $node->getAttribute('NlmCategory') // '', $node->textContent();
}
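To try the XPath approach without a network call, something like the following sketch may help. The inline document is a hypothetical, heavily trimmed stand-in for an efetch response (the real one has many more elements), and @lines is just a name used here. Unlike the loop above, it omits the leading colon when NlmCategory is missing.

```perl
use strict;
use warnings;
use XML::LibXML qw( );

# Hypothetical sample: one structured abstract and one plain abstract.
my $xml = <<'XML';
<PubmedArticleSet>
  <PubmedArticle><MedlineCitation><Article><Abstract>
    <AbstractText NlmCategory="OBJECTIVE">First part.</AbstractText>
    <AbstractText NlmCategory="METHODS">Second part.</AbstractText>
  </Abstract></Article></MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><Article><Abstract>
    <AbstractText>Plain abstract.</AbstractText>
  </Abstract></Article></MedlineCitation></PubmedArticle>
</PubmedArticleSet>
XML

my $doc  = XML::LibXML->load_xml(string => $xml);
my $root = $doc->documentElement();

my @lines;
for my $node ($root->findnodes('PubmedArticle/MedlineCitation/Article/Abstract/AbstractText')) {
    my $category = $node->getAttribute('NlmCategory');  # undef if absent
    my $text     = $node->textContent();
    push @lines, defined $category ? "$category:$text" : $text;
}

print "$_\n" for @lines;
```

The same loop handles both the multi-part abstract and the single plain one, because findnodes simply returns however many matching nodes exist.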