Perl打印重定向的URL

时间:2014-09-28 17:24:14

标签: perl redirect

我想在perl中打印重定向的网址。

输入网址:http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv

输出网址:http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia

use LWP::UserAgent qw();
use CGI qw(:all);
print header();
my ($url) = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
print $res->request;

如何在perl中完成此操作?

3 个答案:

答案 0 :(得分:2)

您需要检查HTTP response以查找网址。 HTTP::Response的文档提供了有关如何执行此操作的完整详细信息,但总而言之,您应该执行以下操作:

use strict;
use warnings;
use feature ':5.10'; # enables "say"
use LWP::UserAgent;
my $url = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";

my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);

# you should add a check to ensure the response was actually successful:
if (! $res->is_success) {
   say "GET failed! " . $res->status_line;
}

# show the base URI for the response:
say "Base URI: " . $res->base;

您可以使用HTTP::Response的{​​{1}}方法查看重定向:

redirects

在这种情况下,基本URI与if ($res->redirects) { # are there any redirects? my @redirects = $res->redirects; say join(", ", @redirects); } else { say "No redirects."; } 相同,如果您检查页面内容,则可以看到原因。

$url

靠近页面底部,有以下代码:

# print out the contents of the response:
say $res->decoded_contents;

重定向由javascript处理,因此LWP :: UserAgent不会选择。如果您想获取此URL,则需要从响应内容中提取它(或使用支持javascript的其他客户端)。

另一方面,你的脚本就像这样开始:

    $(window).load(function() {
        window.setTimeout(function() {
            window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
        }, 300);
    });

模块名称use LWP::UserAgent qw(); 后面的代码用于将特定子例程导入到脚本中,以便您可以按名称使用它们(而不必引用模块名称和子例程名称)。如果qw()为空,则表示没有做任何事情,所以你可以省略它。

答案 1 :(得分:1)

要让LWP::UserAgent关注重定向,只需设置max_redirects选项:

use strict;
use warnings;

use LWP::UserAgent qw();

my $url = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";

my $ua = LWP::UserAgent->new( max_redirect => 5 );

my $res = $ua->get($url);

if ( $res->is_success ) {
    print $res->decoded_content;    # or whatever
} else {
    die $res->status_line;
}

但是,该网站正在使用JavaScript重定向。

    $(window).load(function() {
        window.setTimeout(function() {
            window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
        }, 300);
    });

除非您使用支持JavaScript的框架,例如WWW::Mechanize::Firefox

,否则无效

答案 2 :(得分:0)

最后一行$ res - >会给你一个错误请求,因为它返回响应中的哈希和内容。以下是代码:

use LWP::UserAgent qw();
use CGI qw(:all);
print header();
my ($url) = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
print $res->content;