是否有一个Perl工具,其行为类似于wget --continue
并且能够继续获取部分下载的文件?
答案 0 :(得分:2)
AnyEvent::HTTP
的文档包含使用HTTP 1.1恢复下载的能力的代码。我从来没用过它,所以我不能评论它的适用性。
显然,该示例希望您已经知道如何使用AnyEvent
,当然,我不知道。您需要修改代码以使其具有预期已存在的事件循环:
#!/usr/bin/perl
use strict;
use warnings;
use AnyEvent::HTTP;
my $url = "http://localhost/foo.txt";
my $file = "foo.txt";
sub download {
my ($url, $file, $cb) = @_;
open my $fh, "+>>:raw", $file
or die "could not open $file: $!";
my %hdr;
my $ofs = 0;
if (stat $fh and $ofs = -s _) {
$hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date((stat _)[9]);
$hdr{"range"} = "bytes=$ofs-";
}
http_get $url, (
headers => \%hdr,
on_header => sub {
my ($hdr) = @_;
if ($hdr->{Status} == 200 && $ofs) {
# resume failed
truncate $fh, $ofs = 0;
}
sysseek $fh, $ofs, 0;
return 1;
},
on_body => sub {
my ($data, $hdr) = @_;
if ($hdr->{Status} =~ /^2/) {
length $data == syswrite $fh, $data
or return; # abort on write errors
}
return 1;
},
sub {
my (undef, $hdr) = @_;
my $status = $hdr->{Status};
if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) {
utime $fh, $time, $time;
}
if ($status == 200 || $status == 206 || $status == 416) {
# download ok || resume ok || file already fully downloaded
$cb->(1, $hdr);
} elsif ($status == 412) {
# file has changed while resuming, delete and retry
unlink $file;
$cb->(0, $hdr);
} elsif ($status == 500 or $status == 503 or $status =~ /^59/) {
# retry later
$cb->(0, $hdr);
} else {
$cb->(undef, $hdr);
}
}
);
}
my $quit = AnyEvent->condvar; #create a handle to exit the loop
download $url, $file, sub {
if ($_[0]) {
print "OK!\n";
} elsif (defined $_[0]) {
print "please retry later\n";
} else {
print "ERROR\n";
}
$quit->send; #quit the loop
};
$quit->recv; #start the loop
使其成功的关键是$quit
条件变量:
my $quit = AnyEvent->condvar; #handle to exit the loop
.
.
.
$quit->recv;
这会设置一个事件循环。如果没有事件循环,程序将在调用http_get
之前有机会除了创建文件之外执行任何操作。要退出事件循环,我们会在$quit->send
函数的回调中调用download
。
答案 1 :(得分:1)
我在Google的帮助下找到了这一点(需要特定的搜索字词,而不是说GIYF)。 Link to Google Translate
他们的示例代码是
use strict;
use LWP::UserAgent;
my $u = "http://www.mangafox.com/media/manga.banner.png";
my $f = "tmp.jpg";
my $ua = LWP::UserAgent->new();
## Uncomment for test
# unlink $f;
# system("cp tmp.jpg tmp0.jpg");
# system( "head -c 10000 tmp0.jpg > tmp.jpg" );
download($u, $f);
sub download {
my ($url, $file) = @_;
my ($tries, @parameters, $FD);
@parameters = ( $url,
":content_cb" => sub { my ($chunk) = @_;
print $FD $chunk;
}
);
$tries = 4;
while ( $tries ) {
open($FD, ">>$file") || die "ERROR: $!";
my $bytes = -s $file;
if ( $bytes > 0 ) { push(@parameters, "Range" => "bytes=$bytes-" ) }
my $res =$ua->get( @parameters );
print $res->status_line . "\n";
close $FD;
# 416 Requested Range Not Satisfiable
# (file already fully downloaded)
if ( $res->is_success || $res->code == 416 ) { return }
$tries --;
}
die "ERROR: download $url";
}
# Test with ImageMagick
system("identify tmp.jpg");
我不是说这有效或适合你。只是我发现的东西。使用风险自负。
答案 2 :(得分:1)
尝试过(2次)并且有效。
#!/usr/local/bin/perl
use warnings;
use 5.014;
use utf8;
use LWP::UserAgent;
use File::Basename;
my $url = 'http://server/somelargefile';
my $file = basename $url;
my $ua = LWP::UserAgent->new( show_progress => 1 );
open my $fh, '>>:raw', $file or die $!;
my $bytes = -s $file;
my $res;
if ( $bytes ) {
say "resume download: $file ($bytes)";
$res = $ua->get(
$url,
'Range' => "bytes=$bytes-",
':content_cb' => sub { my ( $chunk ) = @_; print $fh $chunk; }
);
} else {
say "start download";
$res = $ua->get(
$url,
':content_cb' => sub { my ( $chunk ) = @_; print $fh $chunk; }
);
}
close $fh;
my $status = $res->status_line;
if ( $status =~ /^(200|206|416)/ ) {
say "OK" if $status =~ /^20[06]/;
say "$file already downloaded" if $status =~ /^416/;
} else {
say $status;
}