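I am trying to collect data from a website. Some anti-patterns on the page made it hard to find the right form object, but I have worked around that. I am using the post method to bypass some JavaScript that acts as a wrapper for submitting the form. My problem seems to be getting the result back from the $mech->post method.
Here is a shortened version of my code.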
use strict;
use warnings;
use HTML::Tree;
use LWP::Simple;
use WWW::Mechanize;
use HTTP::Request::Common;
use Data::Dumper;
$| = 1;
my $site_url = "http://someURL";
my $mech = WWW::Mechanize->new( autocheck => 1 );
foreach my $number (@numbers)
{
    my $content = get($site_url);
    $mech->get ($site_url);
    my $tree = HTML::Tree->new();
    $tree->parse($content);
    my ($title) = $tree->look_down( '_tag' , 'a' );
    my $atag = "";
    my $atag1 = "";
    foreach $atag ( $tree->look_down( _tag => q{a}, 'class' => 'button', 'title' => 'SEARCH' ) )
    {
        print "Tag is ", $atag->attr('id'), "\n";
        $atag1 = Dumper $atag->attr('id');
    }
    # Enter permit number in "Number" search field
    my @forms = $mech->forms;
    my @fields = ();
    foreach my $form (@forms)
    {
        @fields = $form->param;
    }
    my ($name, $fnumber) = $fields[2];
    print "field name and number is $name\n";
    $mech->field( $name, $number, $fnumber );
    print "field $name populated with search data $number\n" if $mech->success();
    $mech->post($site_url ,
        [
            '$atag1' => $number,
            'internal.wdk.wdkCommand' => $atag1,
        ]) ;
    print $mech->content; # I think this is where the problem is.
}
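The data I get from the final print statement is the content of the original URL, not the page the POST request should take me to. What am I doing wrong?
Many thanks.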
Update
I don't have Firefox installed, so I am deliberately avoiding WWW::Mechanize::Firefox.
Answer 0: (score: 1)
It turned out that I was leaving some required hidden fields out of the POST command.
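For anyone hitting the same thing, here is a minimal sketch of one way to avoid dropping hidden fields: let HTML::Form build the request from the parsed form instead of listing the fields by hand. The HTML snippet, the field names session.token and permitNumber, the value searchButton, and the example.com URL are all invented for illustration; only internal.wdk.wdkCommand comes from the question, and the real page will have its own set of hidden inputs.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical page content; the real site's inputs will differ.
my $html = <<'END_HTML';
<form id="search" method="post" action="/search">
  <input type="hidden" name="internal.wdk.wdkCommand" value="">
  <input type="hidden" name="session.token" value="abc123">
  <input type="text" name="permitNumber" value="">
</form>
END_HTML

my $base_url = 'http://example.com/';    # placeholder base URL

# Parsing keeps every input of the form, hidden ones included.
my ($form) = HTML::Form->parse( $html, $base_url );

# Fill in the visible search field.
$form->value( permitNumber => '12345' );

# Imitate what the JavaScript wrapper is assumed to do: put the
# button id into the hidden command field. Hidden inputs start out
# read-only, so that flag is cleared first.
my $command = $form->find_input('internal.wdk.wdkCommand');
$command->readonly(0);
$command->value('searchButton');

# Build the POST from the whole form, so no hidden field is lost.
my $request = $form->make_request;

print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

The resulting HTTP::Request object could then be sent with $mech->request($request). With a live WWW::Mechanize session, $mech->submit_form( with_fields => { ... } ) should have a similar effect, since it submits the form as parsed from the page rather than a hand-built field list.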