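I can't seem to get this script to work with WWW::Mechanize.
I know it's probably something simple, but I can't see it.
I think it's failing at HTML::TokeParser for some reason.
I get this error message: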
Can't call method "get_token" on an undefined value at Untitled line 13
#!/usr/bin/perl
print "Content-type: text/html\n\n";
use WWW::Mechanize;
my $url = "http://slashdot.org/";
my $agent = WWW::Mechanize->new( autocheck => 1 );
$agent->get($url);
my $stream = HTML::TokeParser->new( $agent->{content} );
while ( my $token = $stream->get_token ) {
    my $ttype = shift @{$token};
    if ( $ttype eq "S" ) {
        my ( $tag, $attr, $attrseq, $rawtxt ) = @{$token};
        if ( $tag eq "div" ) {
            if ( $rawtxt =~ /id="text-/m ) {
                print $stream->get_trimmed_text( $tag, "/div" );
                print "\n\n\n\n";
            }
        }
    }
}
Answer 0 (score: 0)
From the HTML::TokeParser documentation:
$p = HTML::TokeParser->new( \$document, %opt );
The object constructor argument is either a file name, a file handle object, or the complete document to be parsed. Extra options can be provided as key/value pairs and are processed as documented by the base classes.
If the argument is a plain scalar, then it is taken as the name of a file to be opened and parsed. If the file can't be opened for reading, then the constructor will return undef and $! will tell you why it failed.
Your script fails with:
Can't call method "get_token" on an undefined value at Untitled line 13
Check the argument you pass when constructing the HTML::TokeParser object:
my $stream = HTML::TokeParser->new($agent->{content});
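First, you should use WWW::Mechanize's content method to get the page content rather than reaching into the object's hash. Second, you need to pass a reference to the content, not the content itself: a plain scalar is treated as a file name, the open fails, and the constructor returns undef. To correct the code, use: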
my $stream = HTML::TokeParser->new( \$agent->content );
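The difference between the two argument forms can be seen without any network access. The following is only a minimal sketch: the $html string is a made-up stand-in for whatever $agent->content would return, and the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical stand-in for the page body; in the real script this
# would come from $agent->content.
my $html = '<div id="text-1">Hello, world</div>';

# A plain scalar is taken as a file name. No file with that name
# exists, so the constructor returns undef and sets $!.
my $by_name = HTML::TokeParser->new($html);
print "plain scalar: ", ( defined $by_name ? "parser object" : "undef ($!)" ), "\n";

# A reference to a scalar is taken as the document itself.
my $by_ref = HTML::TokeParser->new( \$html )
    or die "Could not create parser: $!\n";
$by_ref->get_tag("div");    # advance to the first <div> start tag
my $text = $by_ref->get_trimmed_text("/div");
print "reference: $text\n";
```

Calling a method on the undef returned by the first form is what produces the "Can't call method ... on an undefined value" error.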
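You may also want to add error checking to make sure the Slashdot page was retrieved successfully before starting the parser (for example, with $agent->success). Note that because your script creates the agent with autocheck => 1, a failed get will already die with an error, so an explicit check matters mostly when autocheck is turned off.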
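Putting both points together, one possible version of the whole script is sketched below. It makes a few assumptions that are not in the original: autocheck is switched off so that the $agent->success check is the thing that handles a failed request, the parsing is moved into a helper sub (text_divs is a made-up name), and the id test uses the parsed attribute hash instead of matching the raw tag text.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::TokeParser;

# Hypothetical helper: return the trimmed text of every <div> whose
# id attribute starts with "text-".
sub text_divs {
    my ($html) = @_;

    # Pass a reference so the string is parsed as the document itself.
    my $stream = HTML::TokeParser->new( \$html )
        or die "Could not create parser: $!\n";

    my @texts;
    while ( my $token = $stream->get_token ) {
        my ( $ttype, $tag, $attr ) = @{$token};
        next unless $ttype eq "S" && $tag eq "div";
        next unless defined $attr->{id} && $attr->{id} =~ /^text-/;
        # Read text up to the next <div> or </div>, as the original did.
        push @texts, $stream->get_trimmed_text( "div", "/div" );
    }
    return @texts;
}

my $url = "http://slashdot.org/";

# autocheck => 0 (an assumption of this sketch) so a failed request is
# reported below instead of dying inside get().
my $agent = WWW::Mechanize->new( autocheck => 0 );
$agent->get($url);

print "Content-type: text/html\n\n";
if ( $agent->success ) {
    print "$_\n\n\n\n" for text_divs( $agent->content );
}
else {
    print "Could not fetch $url: ", $agent->response->status_line, "\n";
}
```

Keeping the parsing in its own sub means it can be tried on a literal HTML string, without fetching anything.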