Question

I need to scrape a website and retrieve certain data that is updated every few minutes. How can I do this?

Answer 1

Use WWW::Mechanize for the scraping. To fetch the page, use the mirror method it inherits from LWP::UserAgent.
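A minimal sketch of what a mirror call might look like. The URL and the local file name are hypothetical, and the autocheck and Accept-Encoding settings are my own assumptions rather than part of the answer above. mirror sends an If-Modified-Since header based on the local file's timestamp, so the file is only rewritten when the server reports a newer copy.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical URL and local file name, chosen only for illustration.
my $url  = 'http://www.example.com/data.html';
my $file = 'data.html';

# autocheck => 0 (assumption): return the response on failure instead of dying.
my $mech = WWW::Mechanize->new( autocheck => 0 );

# Assumption: ask for unencoded content so the saved file is plain HTML,
# in case Mechanize would otherwise request a gzip-compressed body.
$mech->add_header( 'Accept-Encoding' => 'identity' );

# mirror() is inherited from LWP::UserAgent and returns an HTTP::Response.
my $res = $mech->mirror( $url, $file );

if ( $res->code == 304 ) {
    print "Not modified since the last fetch\n";
}
elsif ( $res->is_success ) {
    print "Saved an updated copy to $file\n";
}
else {
    warn 'Fetch failed: ', $res->status_line, "\n";
}
```

Running this from a scheduler such as cron every few minutes would avoid re-downloading a page that has not changed, provided the server honours If-Modified-Since.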

Answer 2

Use sleep to control the wait time, and use WWW::Mechanize to retrieve the data:

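use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0 );  # don't die on a failed request; check it below
my $url  = "http://www.nytimes.com";               # a sample webpage
while (1) {
    $mech->get($url);
    if ( $mech->success ) {
        # format => 'text' strips the HTML markup (requires HTML::TreeBuilder);
        # read the docs for WWW::Mechanize for advanced content processing
        print $mech->content( format => 'text' );
    }
    else {
        warn "Fetch failed: ", $mech->response->status_line, "\n";
    }
    sleep 300;  # wait for 5 minutes
}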

Edit: improved the sample content-retrieval step.
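Since the data only changes every few minutes, it may help to skip snapshots that are identical to the previous one. The following is a rough sketch of one way to do that by comparing MD5 digests. The helper name content_changed and the sample strings are hypothetical, and this is not part of the original answer.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical helper: returns true when $text differs from the text seen
# on the previous call. $state is a hash ref that keeps the last digest.
sub content_changed {
    my ( $state, $text ) = @_;
    my $digest  = md5_hex( encode_utf8($text) );   # md5_hex expects bytes, not wide characters
    my $changed = !defined( $state->{digest} ) || $state->{digest} ne $digest;
    $state->{digest} = $digest;
    return $changed;
}

# Stand-in snapshots; in the polling loop these would be page contents.
my %state;
for my $snapshot ( 'price: 10', 'price: 10', 'price: 12' ) {
    print content_changed( \%state, $snapshot ) ? "changed\n" : "same\n";
}
# prints "changed", "same", "changed"
```

In a polling loop like the one above, the text returned by $mech->content( format => 'text' ) could be passed to content_changed, and the print skipped when it returns false.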

Scraping a website at regular intervals to retrieve data
