sub parse_xml {
    my $xml_link = $_[0];
    my $xml_content = get($xml_link) or warn "Can't get XML page of " . $xml_link . "\n";
    if (!$xml_content) {
        return;
    }
    my $xml = XML::Simple->new(KeepRoot => 1);
    my $xml_data = $xml->XMLin($xml_content);
    my @items = $xml_data->{rss}{channel}->{item};
    # print Dumper($xml_data);
    foreach my $item (@items) {
        if ($item) {
            print Dumper($item); # This is the dump output
            print $item->{author};
            # print $item . "\n";
        }
    }
}
When I try to print the item's author, I get either HASH(memory address)
or "Not a HASH reference at ... line ...".
Am I doing something wrong? Why does this error occur?
Here is the Dumper output.
$VAR1 = [
{
'link' => 'http://***.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67229',
'author' => {},
'title' => 'By: ',
'pubDate' => 'Tue, 08 Jun 2010 12:47 EDT',
'description' => 'Interesting. At least SHE remembered the product that propelled her to recent recognition. When many people I know have commented on how they loved that Betty White Super Bowl spot, they can't recall the product. Ah, advertising.'
},
{
'link' => 'http://***.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67167',
'author' => {},
'title' => 'By: ',
'pubDate' => 'Mon, 07 Jun 2010 13:26 EDT',
'description' => 'Fun, fun, fun. A great attitude for all of us to take into our careers.'
},
{
'link' => 'http://****.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67164',
'author' => 'username',
'title' => 'By: username',
'pubDate' => 'Mon, 07 Jun 2010 12:23 EDT',
'description' => 'Her appearance of the Comedy Central roast of William Shattner a couple of years ago was great... it seems like her willingness to be irreverent makes her more appealing to us all!
www.adverspew.com'
},
{
'link' => 'http://****.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67142',
'author' => {},
'title' => 'By: ',
'pubDate' => 'Mon, 07 Jun 2010 09:50 EDT',
'description' => 'Solid interview. I will definitely be tuning into "Hot in Cleveland" next week. We ought to enjoy Ms. White's talents for as long as we have her. She's great!'
}
];
Answer 0 (score: 1)
You're on the right track. I ran your code against the feed linked from this StackOverflow page and tweaked it slightly.
use strict;
use warnings;
use LWP::Simple;
use XML::Simple;
use Data::Dumper;

sub parse_xml {
    my ($xml_link) = @_;
    my $xml_content = get($xml_link);
    if (!$xml_content) {
        warn "Can't get XML page of $xml_link\n";
        return;
    }
    my $xml = XML::Simple->new(KeepRoot => 1);
    # ForceArray => 1 makes every element an array ref, hence the [0] indexes below.
    my $xml_data = $xml->XMLin($xml_content, ForceArray => 1);
    foreach my $entry (@{ $xml_data->{'feed'}[0]->{'entry'} }) {
        next unless $entry;
        print $entry->{'author'}[0]->{'name'}[0] . "\n";
        print $entry->{'author'}[0]->{'uri'}[0] . "\n";
    }
}

parse_xml('http://stackoverflow.com/feeds/question/10906521');
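This works on that example. I suspect you are trying to print something that is not a plain value. In the StackOverflow feed, 'author' actually contains child nodes, so if you print $item->{'author'} inside the foreach loop, you get the "HASH" result you describe.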
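To illustrate the point about printing a non-scalar value, here is a minimal sketch that does not need a network connection. The `$item` structure and the `author_label` helper are made up for illustration; they only mimic the shape of a parsed `author` element with child nodes.

```perl
use strict;
use warnings;

# Hypothetical item, shaped like a parsed element whose <author> has children.
my $item = { author => { name => 'username', uri => 'http://example.com/users/1' } };

# Returns a printable author label whether the field is a string or a hash ref.
sub author_label {
    my ($author) = @_;
    return '' unless defined $author;
    return $author unless ref $author;                       # plain string: use as-is
    return $author->{name} // '' if ref($author) eq 'HASH';  # hash: look for a name child
    return '';                                               # any other reference type
}

# Interpolating a reference prints something like HASH(0x...), not the content.
print "Stringified reference: $item->{author}\n";
print "Author name: ", author_label($item->{author}), "\n";
```

Checking `ref` before printing is what separates the two cases: it returns an empty string for plain scalars and a type name such as `HASH` for references.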
Looking at your dump and Borodin's sensible comment, this should work for you:
my $xml_data = $xml->XMLin($xml_content, ForceArray => 1);
my $item = $xml_data->{'rss'}[0]->{'channel'}[0]->{'item'};
foreach my $entry (@{$item}) {
    if ($entry) {
        # Skip any field that came back as a reference (e.g. an empty element parsed as {}).
        if (!ref $entry->{'author'}[0]) {
            print $entry->{'author'}[0] . "\n";
        }
        if (!ref $entry->{'description'}[0]) {
            print $entry->{'description'}[0] . "\n";
        }
        if (!ref $entry->{'pubDate'}[0]) {
            print $entry->{'pubDate'}[0] . "\n";
        } # etc.
    }
}
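The repeated `!ref` checks above could be folded into one helper. This is only a sketch under assumptions: `first_text` is a hypothetical name, and `$entry` is hard-coded to resemble what `XMLin` returns with `ForceArray => 1` rather than being parsed from the real feed.

```perl
use strict;
use warnings;

# Illustrative entry shaped like XMLin output with ForceArray => 1:
# every field is an array ref, and an empty element shows up as {}.
my $entry = {
    author      => [ {} ],
    description => [ 'Fun, fun, fun.' ],
    pubDate     => [ 'Mon, 07 Jun 2010 13:26 EDT' ],
};

# Returns the first value of a field if it is plain text, otherwise undef.
sub first_text {
    my ($entry, $field) = @_;
    my $values = $entry->{$field};
    return undef unless ref($values) eq 'ARRAY' && @$values;
    my $value = $values->[0];
    return ref($value) ? undef : $value;
}

# Print only the fields that actually carry text.
for my $field (qw(author description pubDate)) {
    my $text = first_text($entry, $field);
    print "$field: $text\n" if defined $text;
}
```

With the sample data, the `author` line is skipped because its first value is an empty hash ref.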
Answer 1 (score: 1)
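This RSS feed may or may not include <author> information for each item.
If there is no author, the element still appears in the XML but has no content: <author></author>.
XML::Simple represents such an empty element as an empty anonymous hash.
So if an item has author information, $item->{author} is a plain text string; otherwise it is a hash reference.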
You can handle this in your code by writing
foreach my $item (@items) {
    my $author = $item->{author};
    $author = '' if ref $author;   # an empty <author></author> comes back as a hash ref
    print "$author\n";
}
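The dump in the question begins with `$VAR1 = [`, which suggests that each `$item` in the original loop is the whole array reference rather than a single item; that would explain the "Not a HASH reference" error. The sketch below combines dereferencing the list with the empty-author check. The `$channel` data is a shortened, hard-coded imitation of the dump, and `authors_of` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Illustrative data modelled on the Dumper output in the question.
my $channel = {
    item => [
        { author => {},         title => 'By: ' },
        { author => 'username', title => 'By: username' },
    ],
};

# Returns one author string per item; empty <author> elements become ''.
sub authors_of {
    my ($items) = @_;
    # Without ForceArray, a feed with a single <item> yields a hash ref, not an array ref.
    my @items = ref($items) eq 'ARRAY' ? @$items : ($items);
    return map { ref($_->{author}) ? '' : ($_->{author} // '') } @items;
}

print "[$_]\n" for authors_of($channel->{item});
```

In the original code, `my @items = $xml_data->{rss}{channel}->{item};` stores one array reference in `@items`; writing `@{ ... }` around the expression, as `authors_of` does with `@$items`, yields the individual item hashes.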