Perl WWW::Mechanize JSESSION problem

Asked: 2011-01-13 16:50:06

Tags: perl screen-scraping mechanize www-mechanize jsessionid

I'm having trouble logging in to a website using Perl's WWW::Mechanize.

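Looking at the response headers, the JSESSIONID appears to change on every request. I am using a cookie jar, but I think it is somehow being overwritten.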

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;
use Crypt::SSLeay;

use LWP::UserAgent;
use Crypt::SSLeay::CTX;
use Crypt::SSLeay::Conn;
use Crypt::SSLeay::X509;

use LWP::Simple qw(get);
use LWP::Debug;

my $cookie_jar = HTTP::Cookies->new(ignore_discard => 1);
my $agent = WWW::Mechanize->new(cookie_jar => $cookie_jar, noproxy=>0);
$agent->agent_alias('Linux Mozilla');

$ENV{HTTPS_CA_DIR} = 'cert/';

my $user = 'xxxx';
my $pass = 'xxxx';

my $url = '';

print "\n\n=========================================================\nGOING TO LOGIN PAGE:\n";
my $res = $agent->get($url);

for my $key ( $res->header_field_names() ) {
    print $key, " : ", $res->header( $key ), "\n";
}
print "cookie: ".$agent->cookie_jar->as_string();
$agent->form_name('loginForm');
$agent->set_fields(
    userId => $user,
    password => $pass
);    
$agent->submit();


print "\n\n=========================================================\nREDIRECT:\n";
$res = $agent->submit();    # reuse $res; redeclaring it with "my" triggers a warning

for my $key ( $res->header_field_names() ) {
    print $key, " : ", $res->header( $key ), "\n";
}
print "cookie: ".$agent->cookie_jar->as_string();   


my $cUrl = '';
$cookie_jar->revert;

print "\n\n=========================================================\nGOING TO CAMPAIGN PAGE:\n";
$res = $agent->get($cUrl);    # reuse $res; redeclaring it with "my" triggers a warning

for my $key ( $res->header_field_names() ) {
    print $key, " : ", $res->header( $key ), "\n";
}
print "cookie: ".$agent->cookie_jar->as_string();

1 Answer:

Answer 0 (score: 0)

I'm not sure why this happens, but I was able to work around the problem by using LWP::ConnCache:

$agent->conn_cache(LWP::ConnCache->new());
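
For anyone wanting to check this on their own setup, here is a minimal offline sketch, not the original poster's exact fix. It attaches a connection cache as above and adds a small helper, `jsessionids` (a hypothetical name, not part of any library), that lists the JSESSIONID values currently in the jar so they can be compared between requests. The cookie is seeded by hand with a made-up value and domain (`ABC123`, `example.com`) so nothing touches the network. It also exercises `$cookie_jar->revert` from the question: as far as I can tell, `revert` clears the jar and reloads it from its `file`, so on a jar created without a `file` it may simply leave the jar empty, which could be a second reason the session cookie does not survive.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;
use LWP::ConnCache;

# In-memory jar; ignore_discard keeps session-only cookies such as JSESSIONID.
my $cookie_jar = HTTP::Cookies->new(ignore_discard => 1);
my $agent      = WWW::Mechanize->new(cookie_jar => $cookie_jar);

# The workaround from the answer: keep connections alive between requests.
$agent->conn_cache(LWP::ConnCache->new());

# Hypothetical helper: collect every JSESSIONID value stored in the jar.
sub jsessionids {
    my ($jar) = @_;
    my @ids;
    $jar->scan(sub {
        my ($version, $key, $val) = @_;
        push @ids, $val if $key eq 'JSESSIONID';
    });
    return @ids;
}

# Seed the jar by hand (made-up value and domain) so the sketch runs offline.
# Arguments: version, key, value, path, domain, port, path_spec, secure, maxage, discard
$cookie_jar->set_cookie(0, 'JSESSIONID', 'ABC123', '/', 'example.com',
                        undef, 0, 0, 3600, 0);

my @before = jsessionids($cookie_jar);

# Same call as in the question; this jar has no backing file to reload from.
$cookie_jar->revert;

my @after = jsessionids($cookie_jar);

printf "JSESSIONID entries before revert: %d\n", scalar @before;
printf "JSESSIONID entries after revert:  %d\n", scalar @after;
```

In a real script, calling `jsessionids($agent->cookie_jar)` after each `get` or `submit` shows whether the server is issuing a new session or the client is losing the old one. If the count drops to zero after `revert`, removing that call (or giving the jar a `file` and calling `save` first) is worth trying alongside the connection cache.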