Question

Hello. Currently I am able to parse an XML file if it has been saved from the web page into my folder.

use strict;
use warnings;
use Data::Dumper;
use XML::Simple;

my $parser = new XML::Simple;
my $data = $parser->XMLin("config.xml");
print Dumper($data);

But if I try to parse it directly from the website, it does not work.

use strict;
use warnings;
use Data::Dumper;
use XML::Simple;

my $parser = new XML::Simple;
my $data = $parser->XMLin("http://website/computers/computers_main/config.xml");
print Dumper($data);

It gives me the following error: "File does not exist: http://website/computers/computers_main/config.xml at test.pl line 12"

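How can I parse multiple XML files from a web page? I have to fetch several XML files from a website and parse them. Can someone help me with this?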

Answer 1

Read the documentation for XML::Simple. Note that the XMLin method can take a file handle, a string, or even an IO::Handle object. What it cannot take is a URL over HTTP.

Use the Perl module LWP::Simple to fetch the XML file you need, then pass its content to XMLin.

You will have to download and install LWP::Simple using cpan, just as you did earlier for XML::Simple.

Answer 2

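Super edit: This method requires WWW::Mechanize, but it allows you to log in to your website and then fetch the XML page. You will have to change a few things, as noted in the comments. Hope this helps.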

use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
use WWW::Mechanize;
use HTTP::Cookies;

# Create a new instance of Mechanize
my $bot = WWW::Mechanize->new();
# Create a cookie jar for the login credentials
$bot->cookie_jar(
    HTTP::Cookies->new(
        file           => "cookies.txt",
        autosave       => 1,
        ignore_discard => 1,
    )
);
# Connect to the login page
my $response = $bot->get( 'http://www.thePageYouLoginTo.com' );
# Select the login form (the first form on the page)
$bot->form_number(1);
# Enter the login credentials.
# You will have to change the field names "login" and "pass"
# (on the left) to match the field names of the form you are
# logging into (found in the source of the website). Then put
# your own credentials on the right.
$bot->field( login => 'thisIsWhereYourLoginInfoGoes' );
$bot->field( pass  => 'thisIsWhereYourPasswordInfoGoes' );
$response = $bot->click();
# Get the XML page
$response = $bot->get( 'http://website/computers/computers_main/config.xml' );
my $content = $response->decoded_content();
my $parser = XML::Simple->new;
my $data = $parser->XMLin($content);
print Dumper($data);

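Give this a go. As mentioned above, it uses LWP::Simple. It simply connects to the page, grabs the content of that page (the XML file), and runs it through XMLin. Edit: added simple error checking on the get $url line. Edit 2: leaving this code here because it works fine if no login is required.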

use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
use LWP::Simple;

my $parser = new XML::Simple;

my $url = 'http://website/computers/computers_main/config.xml';
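# get returns undef on failure; check defined-ness so an empty body is not mistaken for an error
my $content = get($url);
die "Unable to get $url\n" unless defined $content;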
my $data = $parser->XMLin($content);

print Dumper($data);
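
The question also asks about several XML files rather than one. A minimal sketch of one way to extend the same idea is shown below; it is not taken from any of the answers. The helper name parse_xml_urls and the choice to read URLs from the command line are assumptions made for illustration. It only uses get from LWP::Simple and XMLin from XML::Simple, and it skips any URL that cannot be fetched or parsed instead of stopping.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
use LWP::Simple qw(get);

# Hypothetical helper: fetch each URL with $fetch (a code reference) and
# parse the body with XMLin. Returns a hash reference mapping each URL to
# its parsed data; URLs that fail to download or parse are skipped.
sub parse_xml_urls {
    my ($fetch, @urls) = @_;
    my %parsed;
    for my $url (@urls) {
        my $content = $fetch->($url);
        unless (defined $content) {
            warn "Unable to get $url\n";
            next;
        }
        # XMLin dies on malformed input, so trap the error per document.
        my $data = eval { XMLin($content) };
        unless (defined $data) {
            warn "Unable to parse $url: $@";
            next;
        }
        $parsed{$url} = $data;
    }
    return \%parsed;
}

# URLs come from the command line; with no arguments nothing is fetched.
my $results = parse_xml_urls(\&get, @ARGV);
for my $url (sort keys %$results) {
    print "$url parsed OK\n";
}
```

Saved under a name of your choosing (say fetch_all.pl, a hypothetical name), it could be run as perl fetch_all.pl followed by the list of XML URLs. Passing the fetcher as a code reference means the same loop could later use a logged-in WWW::Mechanize object instead of LWP::Simple, if the site requires a login.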

Answer 3

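If you have no specific reason to stick with XML::Simple, use another parser such as XML::Twig or XML::LibXML, which provide built-in features for parsing XML that is available over the web.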

Here is some simple code using XML::Twig:

use strict;
use warnings;
use XML::Twig;
use LWP::Simple;

my $url = 'http://website/computers/computers_main/config.xml';
my $twig= XML::Twig->new();
$twig->parse( LWP::Simple::get( $url ));

As mentioned above, XML::Simple has no such built-in feature.

How do I parse an XML web page in Perl?
