Perl WWW::Mechanize: retrieving page details after a POST

Time: 2014-03-26 02:11:09

Tags: perl http-post www-mechanize lwp-useragent

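I am trying to collect data from a website. Some anti-patterns on the site made it hard to find the right form objects, but I have solved that part. I am using the post method to work around some JavaScript that acts as a wrapper for submitting the form. My problem seems to be getting the results back from the $mech->post method.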

Here is a shortened version of my code.

use strict;
use warnings;
use HTML::Tree;
use LWP::Simple;
use WWW::Mechanize;
use HTTP::Request::Common;
use Data::Dumper;
$| = 1;

my $site_url = "http://someURL"; 
my $mech = WWW::Mechanize->new( autocheck => 1 );
foreach my $number (@numbers) 
{
    my $content = get($site_url);
       $mech->get ($site_url);

    my $tree = HTML::Tree->new();

    $tree->parse($content);

    my ($title) = $tree->look_down( '_tag' , 'a' );
    my $atag = "";
    my $atag1 = "";
    foreach $atag ( $tree->look_down( _tag => q{a}, 'class' => 'button', 'title' => 'SEARCH'     )  ) 
    {
        print "Tag is ", $atag->attr('id'), "\n";
        $atag1 = Dumper $atag->attr('id');
    }

# Enter permit number in "Number" search field
    my @forms = $mech->forms;
    my @fields = ();
    foreach my $form (@forms)
    {
        @fields = $form->param;
    }
    my ($name, $fnumber) = $fields[2];
    print "field name and number is $name\n";
    $mech->field( $name, $number, $fnumber );
    print "field $name populated with search data $number\n" if $mech->success();

    $mech->post($site_url , 
    [
       '$atag1' => $number,
       'internal.wdk.wdkCommand' => $atag1,
    ]) ;

print $mech->content; # I think this is where the problem is.

}

The data I get from the final print statement is the content of the original URL, not the page the POST should take me to. What am I doing wrong?

Many thanks

Update

I do not have Firefox installed, so I am deliberately avoiding WWW::Mechanize::Firefox.

1 answer:

Answer 0 (score: 1)

It turned out that I had left some required hidden fields out of the POST command.
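
For anyone hitting the same thing, here is a minimal sketch of one way to avoid dropping hidden fields: let HTML::Form collect every input in the form, override only the values you need, and build the request from that. The markup, the session_token and permit_number field names, and the 'searchButton' command value are all made up for illustration; only internal.wdk.wdkCommand comes from the code in the question, and the real page will likely have different hidden inputs.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical markup standing in for the real page. Apart from
# internal.wdk.wdkCommand, the field names and values are invented.
my $html = <<'HTML';
<form action="/search" method="post">
  <input type="hidden" name="internal.wdk.wdkCommand" value="">
  <input type="hidden" name="session_token" value="abc123">
  <input type="text" name="permit_number" value="">
</form>
HTML

# Parse the form; the second argument is the base URL used to resolve
# the relative action attribute.
my ($form) = HTML::Form->parse( $html, 'http://someURL/' );

# Fill in the visible search field.
$form->value( 'permit_number', '12345' );

# Hidden inputs start out read-only, so unlock the one the JavaScript
# wrapper would normally set before giving it a value.
my $command = $form->find_input('internal.wdk.wdkCommand');
$command->readonly(0);
$command->value('searchButton');    # assumed value, for illustration only

# Every name/value pair in the form, hidden fields included.
my @fields = $form->form;

# An HTTP::Request carrying all of those pairs.
my $request = $form->make_request;

print $request->method,  "\n";
print $request->content, "\n";
```

With a live page you would take the form from $mech->forms instead of parsing a string, then send it with $mech->request($request), or pass the pairs to $mech->post( $site_url, \@fields ). Either way the hidden inputs travel with the POST without having to be listed by hand.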