我正在尝试使用从服务器收到的实际文件名保存Google云端硬盘共享文件(内容处理):
我试图分析标题:
use strict;
use warnings;
use Data::Dumper;
use LWP::UserAgent qw( );
my $str = 'https://drive.google.com/file/d/0B6vqTWO9kmdmdzk5ejhDSXgzMDg/view?usp=sharing';
$str =~ /file\/d\/(\w+)/;
my $url = 'https://drive.google.com/uc?export=download&id='.$1;
my $ua = LWP::UserAgent->new();
my $response = $ua->head($url)->{'_headers'};
print Dumper( $response );
我明白了:
$VAR1 = bless( {
'client-ssl-cert-subject' => '/C=US/ST=California/L=Mountain View/O=Google Inc/CN=*.googleusercontent.com',
'connection' => 'close',
'date' => 'Mon, 19 Oct 2015 14:45:45 GMT',
'content-type' => 'text/html; charset=UTF-8',
'x-guploader-uploadid' => 'AEnB2UrQJIoJUIIhWnKz9HAlW_2XKApLe_0IDMZjS0gGQOMdRaF68Od2xsxssp7mBdQP9kNrjvDueWUP5pSa1eHbprSjbPvfbA',
'alternate-protocol' => '443:quic,p=1',
'expires' => 'Mon, 19 Oct 2015 14:45:45 GMT',
'::std_case' => {
'client-ssl-socket-class' => 'Client-SSL-Socket-Class',
'client-ssl-cert-subject' => 'Client-SSL-Cert-Subject',
'client-ssl-cipher' => 'Client-SSL-Cipher',
'client-peer' => 'Client-Peer',
'x-guploader-uploadid' => 'X-GUploader-UploadID',
'alternate-protocol' => 'Alternate-Protocol',
'alt-svc' => 'Alt-Svc',
'client-ssl-cert-issuer' => 'Client-SSL-Cert-Issuer',
'client-date' => 'Client-Date',
'client-response-num' => 'Client-Response-Num'
},
'client-ssl-cert-issuer' => '/C=US/O=Google Inc/CN=Google Internet Authority G2',
'server' => 'UploadServer',
'client-date' => 'Mon, 19 Oct 2015 14:45:55 GMT',
'client-ssl-socket-class' => 'IO::Socket::SSL',
'client-ssl-cipher' => 'ECDHE-ECDSA-AES128-GCM-SHA256',
'client-peer' => '2a00:1450:400c:c08::84:443',
'alt-svc' => 'quic=":443"; p="1"; ma=604800',
'cache-control' => 'private, max-age=0',
'client-response-num' => 1
}, 'HTTP::Headers' );
我希望在上面找到content-disposition标头。 另一方面,wget正确地给出了文件名:
#wget --content-disposition "https://drive.google.com/uc?export=download&id=0B6vqTWO9kmdmdzk5ejhDSXgzMDg"
--2015-10-19 20:21:37-- https://drive.google.com/uc?export=download&id=0B6vqTWO9kmdmdzk5ejhDSXgzMDg
Resolving drive.google.com (drive.google.com)... 2a00:1450:400c:c04::64, 74.125.206.139, 74.125.206.101, ...
Connecting to drive.google.com (drive.google.com)|2a00:1450:400c:c04::64|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0s-2s-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/unidk5uvfpl9kut1rl3hb5lqcvis8vdq/1445263200000/06380472059566149580/*/0B6vqTWO9kmdmdzk5ejhDSXgzMDg?e=download [following]
Warning: wildcards not supported in HTTP.
--2015-10-19 20:21:37-- https://doc-0s-2s-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/unidk5uvfpl9kut1rl3hb5lqcvis8vdq/1445263200000/06380472059566149580/*/0B6vqTWO9kmdmdzk5ejhDSXgzMDg?e=download
Resolving doc-0s-2s-docs.googleusercontent.com (doc-0s-2s-docs.googleusercontent.com)... 2a00:1450:400c:c08::84, 74.125.140.132
Connecting to doc-0s-2s-docs.googleusercontent.com (doc-0s-2s-docs.googleusercontent.com)|2a00:1450:400c:c08::84|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16 [text/plain]
Saving to: ‘testdoc.txttestdoc.txt’
testdoc.txttestdoc.txt 100%[====================================================================================================================>] 16 --.-KB/s in 0s
2015-10-19 20:21:38 (550 KB/s) - ‘testdoc.txttestdoc.txt’ saved [16/16]
如何使用perl从服务器获取正确的文件名?
答案 0 :(得分:3)
不同之处在于你没有使用wget进行HEAD
。如果您在Perl代码中查看响应的状态,那么
503 Service Unavailable
可以引用很多东西,但在这种情况下意味着不支持HEAD
。更改GET
并且一切正常
use strict;
use warnings;
use feature 'say';
use LWP::UserAgent;
my $str = 'https://drive.google.com/file/d/0B6vqTWO9kmdmdzk5ejhDSXgzMDg/view?usp=sharing';
die unless $str =~ m{file/d/(\w+)};
my $url = "https://drive.google.com/uc?export=download&id=$1";
my $ua = LWP::UserAgent->new;
my $resp = $ua->get($url);
say $resp->status_line;
say $resp->header('Content-Disposition');
200 OK
attachment;filename="testdoc.txt";filename*=UTF-8''testdoc.txt
答案 1 :(得分:0)
差异的原因是:
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0s-2s-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/unidk5uvfpl9kut1rl3hb5lqcvis8vdq/1445263200000/06380472059566149580/*/0B6vqTWO9kmdmdzk5ejhDSXgzMDg?e=download [following]
您的初始网址已被重定向 - 因此,您要获取标题的内容不是正在下载的文件wget
。尝试阅读status_line
你的头部请求,看看你得到了什么。