Question

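Usually I write scrapers in Ruby, but this time I decided to do it in Perl. When I ran my script, I saw that the number of opened URLs grows very, very slowly. I thought it might be a redirect problem, or maybe the problem is that the URLs point to JS-driven sites. So I decided to use a module that can open JS sites. I looked at the CPAN docs, took the example code and tried to run it. No content comes back. What am I doing wrong? Please correct me, or give me some advice. I also tried Selenium, but I had problems installing it, and I get errors when I try to run selenium from the Linux console.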

use WWW::Scripter;

  $w = new WWW::Scripter;  
  $w->use_plugin('JavaScript');

   open(FH, "<links.csv");
   while (<FH>) {
    $url =  $_;

    if ( $url !~ /http(s)/ ) {
        $url = "http://".$url;
    }

    $w->get(url);
    $html = $w->content;

    print "=======\n";
    print Dupmper $w->content;
    print "=======\n";
}

Answer 1

$w->get(url);

It should be $url, not url. Use strict and warnings.
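
To illustrate why strict matters here, the following is a minimal sketch (the variable names $loose and $ok are only for illustration). Without strict, Perl silently treats the bareword url as the string "url", so the script asks for a page literally named "url". With strict enabled, the same line fails to compile.

```perl
use strict;
use warnings;

# Without strict, a bareword is quietly treated as the string "url".
my $loose = eval q{ no strict; no warnings; my $u = url; $u };
print "without strict: $loose\n";

# With strict, the same statement is rejected at compile time,
# so the eval returns undef and the error message ends up in $@.
my $ok = eval q{ use strict; my $u = url; 1 };
print "with strict: ", ($ok ? "compiled\n" : "error: $@");
```

The second eval reports a 'Bareword "url" not allowed while "strict subs" in use' error, which is the kind of message that would have pointed straight at the typo.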

Answer 2

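Firstly, you should always use strict and use warnings in your Perl programs. They will pick up your typos.

Secondly, you should check the return code from get(), as that will show you what went wrong.

Thirdly, there are a few outdated Perl programming practices in your code.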

# Always use these
use strict;
use warnings;

use WWW::Scripter;
# Added this
use Data::Dumper;

# Don't use indirect object notation.
# Declare variable
my $w = WWW::Scripter->new;
$w->use_plugin('JavaScript');

# Three-arg version of open()
# Lexical filehandle
# Check result of open() and die on failure
open(my $in_fh, '<', 'links.csv') or die $!;

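# Read each line into a declared lexical variable
while (my $url = <$in_fh>) {
  # Strip the trailing newline and skip blank lines
  chomp $url;
  next unless length $url;

  # Fixed regex.
  # 1/ Anchored at start of string
  # 2/ Made 's' optional (and non-captured)
  if ( $url !~ /^https?/ ) {
    # Use string interpolation
    $url = "http://$url";
  }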

  # Capture HTTP response
  # Use '$url', not 'url'
  my $resp = $w->get($url);

  # Check request is successful
  unless ($resp->is_success) {
    # If not, warn and move on to the next URL
    warn $resp->status_line;
    next;
  }

  print "=======\n";
  print Dumper $w->content;
  print "=======\n";
}
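
As a side note on the regex fix, the original test /http(s)/ only matches strings that contain "https", so a line such as http://example.com would get a second "http://" prepended. Here is a small sketch of the URL clean-up on its own, with no network access. The helper name normalize_url and the sample host names are made up for illustration and are not part of WWW::Scripter.

```perl
use strict;
use warnings;

# Hypothetical helper: tidy one line read from links.csv and
# add a scheme only when the line does not already start with one.
sub normalize_url {
    my ($line) = @_;

    # Remove the trailing newline and any surrounding whitespace
    $line =~ s/^\s+|\s+$//g;

    # Nothing left: signal "skip this line" with a bare return
    return if $line eq '';

    # Anchored at the start, 's' optional, case-insensitive
    return $line =~ m{^https?://}i ? $line : "http://$line";
}

for my $line ("example.com\n", "https://example.org\n", "http://example.net\n", "\n") {
    my $url = normalize_url($line);
    print defined $url ? "$url\n" : "(skipped empty line)\n";
}
```

Keeping this step in its own sub makes it easy to check the input handling separately from the get() call.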

Perl WWW::Scripter doesn't work, returns no content

2 answers: