I want to print the URL that a given URL redirects to, in Perl.
Input URL: http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv
Output URL: http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia
use LWP::UserAgent qw();
use CGI qw(:all);
print header();
my ($url) = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
print $res->request;
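How can I do this in Perl?
Answer 0 (score: 2)
You need to examine the HTTP response to find the URL. The documentation for HTTP::Response gives full details of how to do this, but in summary, you should do the following: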
use strict;
use warnings;
use feature ':5.10'; # enables "say"
use LWP::UserAgent;
my $url = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => $url);
my $res = $ua->request($req);
# you should add a check to ensure the response was actually successful:
if (! $res->is_success) {
say "GET failed! " . $res->status_line;
}
# show the base URI for the response:
say "Base URI: " . $res->base;
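You can use the redirects method of HTTP::Response to see whether there were any redirects:
if ($res->redirects) { # are there any redirects?
my @redirects = $res->redirects;
# each element is an HTTP::Response object, so print the URL it was fetched from
say join(", ", map { $_->request->uri } @redirects);
}
else {
say "No redirects.";
}
In this case, the base URI is the same as $url, and if you examine the page content, you can see why:
# print out the contents of the response:
say $res->decoded_content;
Near the bottom of the page, there is the following code:
$(window).load(function() {
window.setTimeout(function() {
window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
}, 300);
});
The redirect is handled by JavaScript, so LWP::UserAgent does not pick it up. If you want to get this URL, you will need to extract it from the response content (or use a different client that supports JavaScript).
On another note, your script begins like this:
use LWP::UserAgent qw();
The code following the module name, qw(), is used to import specific subroutines into your script so that you can call them by name (instead of having to write both the module name and the subroutine name). With an empty qw(), nothing is imported, so here you can simply omit it.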
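As a rough illustration of the "extract it from the response content" option, here is a minimal sketch that pulls the target of a window.location assignment out of a page with a regular expression. The helper name extract_js_redirect is made up for this example, and the $html string is a stand-in for $res->decoded_content so that the sketch runs without network access. It assumes the URL appears as a quoted string literal, as it does on this page; a regex like this is fragile and will not cope with URLs built up dynamically in JavaScript.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): returns the first quoted URL that is
# assigned to window.location (or window.location.href) in the given HTML,
# or undef if no such assignment is found.
sub extract_js_redirect {
    my ($html) = @_;
    if ($html =~ /window\.location(?:\.href)?\s*=\s*(["'])(.*?)\1/) {
        return $2;
    }
    return;
}

# Stand-in for $res->decoded_content, copied from the page shown above.
my $html = <<'HTML';
<script>
$(window).load(function() {
    window.setTimeout(function() {
        window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
    }, 300);
});
</script>
HTML

my $target = extract_js_redirect($html);
print defined $target ? "$target\n" : "No JavaScript redirect found.\n";
```

In a real script you would pass $res->decoded_content to the helper instead of the hard-coded string.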
Answer 1 (score: 1)
To have LWP::UserAgent follow redirects, just set the max_redirect option:
use strict;
use warnings;
use LWP::UserAgent qw();
my $url = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new( max_redirect => 5 );
my $res = $ua->get($url);
if ( $res->is_success ) {
print $res->decoded_content; # or whatever
} else {
die $res->status_line;
}
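When a redirect does happen at the HTTP level (a 3xx status with a Location header), the final URL can be read from the request attached to the last response. The sketch below shows this offline by wiring up HTTP::Response objects by hand, roughly the way LWP::UserAgent links them after following a redirect. The example.com URLs and the helper name final_url are placeholders invented for this illustration, and it assumes libwww-perl (which provides HTTP::Request and HTTP::Response) is installed.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: the URL of the request that produced this response,
# which after HTTP redirects is the last URL in the chain.
sub final_url {
    my ($res) = @_;
    return $res->request->uri->as_string;
}

# First hop: the server answers with a 302 pointing elsewhere.
my $redirect = HTTP::Response->new(302, 'Found', [ Location => 'http://example.com/new' ]);
$redirect->request(HTTP::Request->new(GET => 'http://example.com/old'));

# Second hop: the final response, linked back to the redirect via previous().
my $final = HTTP::Response->new(200, 'OK');
$final->request(HTTP::Request->new(GET => 'http://example.com/new'));
$final->previous($redirect);

print "Final URL: ", final_url($final), "\n";
print "Redirect hops: ", scalar($final->redirects), "\n";
```

With a live request you would call the same methods on the $res returned by $ua->get($url).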
However, the site is using a JavaScript redirect:
$(window).load(function() {
window.setTimeout(function() {
window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
}, 300);
});
LWP::UserAgent will not follow a redirect like this; for that you need a framework that supports JavaScript, such as WWW::Mechanize::Firefox.
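Answer 2 (score: 0)
The last line of your script, print $res->request;, will not give you what you want, because request returns the request object (a hash reference) rather than the content of the response. Print the content instead. Here is the code: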
use LWP::UserAgent qw();
use CGI qw(:all);
print header();
my ($url) = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
print $res->content;