How to make XML::RSS::Parser take its input from a URL instead of a file

Time: 2013-06-16 01:28:21

Tags: perl xml-parsing

I have this short Perl script that uses XML::RSS::Parser to extract some information from an XML file.

#!/usr/bin/perl -w
use strict;

use XML::RSS::Parser;
use FileHandle;
use Data::Dumper;
use URI;

my $p = XML::RSS::Parser->new;
my $fh = FileHandle->new('ronicky-doone-by-max-brand.xml');

my $feed = $p->parse_file( $fh );
print $p->errstr;

my $feed_title = $feed->query('/channel/title');
print $feed_title->text_content;

my $feed_desc = $feed->query('/channel/description');
print $feed_desc->text_content;

I know there is a parse_uri method, but I can't seem to turn my URL, http://librivox.org/bookfeeds/ronicky-doone-by-max-brand.xml, into a URI that I can pass as an argument to XML::RSS::Parser::parse_uri.

1 Answer:

Answer 0 (score: 3)

I don't know what you have tried, but it is very simple. Perhaps the use of FileHandle confused you?

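This version of the code works fine. It passes the URL to parse_uri as a plain string, so there is no need to build a URI object or a file handle first. Note that the -w command-line option was superseded by use warnings many years ago, except in short command-line Perl programs (one-liners). Also, I had to set STDOUT to expect UTF-8, as there are some extended characters in this RSS feed.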

use strict;
use warnings;

use XML::RSS::Parser;

binmode STDOUT, ':encoding(UTF-8)';

my $parser = XML::RSS::Parser->new;
my $feed = $parser->parse_uri('http://librivox.org/bookfeeds/ronicky-doone-by-max-brand.xml');

printf "Title: %s\n", $feed->query('/channel/title')->text_content;

printf "Description: %s\n", $feed->query('/channel/description')->text_content;

Output

Title: Librivox: Ronicky Doone by Brand, Max
Description:  Frederick Schiller Faust (1892-1944), is best known today for his western fiction. Faust was born in Seattle, Washington and at an early age moved with his parents to the San Joaquin Valley in California where he worked as a ranchhand. After a failed attempt to enlist in the Great War in 1917 and with the help of Mark Twain's sister he met Robert Hobart Davis, editor of All-Story Weekly and became a regular contributor writting under his most used pseudonym “Max Brand”. He wrote in many genres during his career and produced more than 300 western novels and stories. His most famous characters were Destry and Dr. Kildare, both of which were produced in film. Faust was killed in Italy in 1944 as a front line war correspondent at the age of 51. He is buried in the Sicily-Rome American Cemetery in Nettuno, Italy.

Ronicky Doone (1926) is a hero of the west, respected by the law-abiding citizen and hated by bushwhacking bandits. Bill Gregg is a man in love, not about to be deflected from meeting his lady love for the first time, and willing to stand up to the living legend to reach her. This initial meeting leads to a friendship between the two and they travel east to New York City on the trail of the girl. When they find the girl, Caroline Smith, and she refuses to leave, Ronicky must discover the secret that holds her. They encounter the sinister John Mark and the beautiful Ruth Tolliver and are exposed to the horrors and vices of big city life as they attempt to rescue Caroline and find their way back to the mountain-desert of the west. (Summary by Rowdy Delaney)
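
As a side note on the binmode line: the description contains curly quotes around Max Brand, which are characters outside Latin-1. The small sketch below is not part of the feed script; it only illustrates, using core Perl and an in-memory file handle, what the ':encoding(UTF-8)' layer does to such characters when they are printed. The sample string and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Sample text with the curly quotes that appear in the feed description
# (U+201C and U+201D). This string is an illustrative stand-in, not feed data.
my $text = "\x{201C}Max Brand\x{201D}";

# Print to an in-memory file handle so the emitted bytes can be inspected.
# The ':encoding(UTF-8)' layer is the same one applied to STDOUT in the answer.
my $bytes = '';
open my $out, '>:encoding(UTF-8)', \$bytes
    or die "Cannot open in-memory handle: $!";
print {$out} $text;
close $out;

# Each curly quote is encoded as three bytes, so 11 characters become 15 bytes.
printf "%d characters, %d bytes\n", length($text), length($bytes);
```

Without an encoding layer on the handle, Perl would warn about a "Wide character in print" for the same string, which is why the answer sets the layer on STDOUT before printing the feed text.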