I am downloading a ZIP file from a URL and have a question about it. The first step of my algorithm is to check the Content-Type and Content-Length of the given URL:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip");
curl_setopt($ch, CURLOPT_HEADER, 1); // include response headers in the output (like -I)
curl_setopt($ch, CURLOPT_NOBODY, 1); // send a HEAD request, no body
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects (like -L)
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);
$content_type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
However, the value of $content_type is text/html; charset=utf-8.
I then checked the Content-Type from the command line like this:
curl -IL https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip
This gives the correct result (application/zip).
So what is the difference between these two requests, and how can I get the correct Content-Type in my PHP script?
Edit:
curl_setopt($ch, CURLOPT_URL, 'https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD');
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_STDERR, $verbose);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
Verbose output from PHP curl:
* Hostname was found in DNS cache
* Hostname in DNS cache was stale, zapped
* Trying 162.125.69.1...
* Connected to www.dropbox.com (162.125.69.1) port 443 (#14)
* successfully set certificate verify locations:
* CAfile: none
CApath: /etc/ssl/certs
* SSL connection using ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
* subject: businessCategory=Private Organization; 1.3.6.1.4.1.311.60.2.1.3=US; 1.3.6.1.4.1.311.60.2.1.2=Delaware; serialNumber=4348296; C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=www.dropbox.com
* start date: 2017-11-14 00:00:00 GMT
* expire date: 2020-02-11 12:00:00 GMT
* subjectAltName: www.dropbox.com matched
* issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
* SSL certificate verify ok.
> HEAD /s/0hvgw7nvbdnh13d/ColaClassic.zip HTTP/1.1
Host: www.dropbox.com
Accept: */*
Verbose output from command-line curl:
* Trying 162.125.69.1...
* TCP_NODELAY set
* Connected to www.dropbox.com (162.125.69.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
* subject: businessCategory=Private Organization; jurisdictionCountryName=US; jurisdictionStateOrProvinceName=Delaware; serialNumber=4348296; C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=www.dropbox.com
* start date: Nov 14 00:00:00 2017 GMT
* expire date: Feb 11 12:00:00 2020 GMT
* subjectAltName: host "www.dropbox.com" matched cert's "www.dropbox.com"
* issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 Extended Validation Server CA
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fd8c4007a00)
> HEAD /s/0hvgw7nvbdnh13d/ColaClassic.zip HTTP/2
> Host: www.dropbox.com
> User-Agent: curl/7.54.0
> Accept: */*
Answer 0 (score: 1)
It seems Dropbox issues a different response code depending on the user agent (or the lack of one). Your command-line request sends something like curl/7.47.0 (or whatever your version is), while the PHP script sends an empty user agent. Adding a user agent to your PHP request makes Dropbox respond with the appropriate HTTP/1.1 301 Moved Permanently response, and your script then follows the Location header as expected:
$ch = curl_init();
// emulates user agent from command line.
$user_agent = 'curl/' . curl_version()['version'];
curl_setopt($ch, CURLOPT_URL, "https://www.dropbox.com/s/0hvgw7nvbdnh13d/ColaClassic.zip");
curl_setopt($ch, CURLOPT_HEADER, 1); //I
curl_setopt($ch, CURLOPT_NOBODY, 1); //without body
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //L
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_exec($ch);
$content_type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
echo $content_type;
Update: Oddly enough, I just tried a few other things, such as emulating various browser user agent strings, and it seems that Dropbox only issues the redirect when presented with a curl/X.X.X user agent.
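To check which hop in a redirect chain actually reports which Content-Type, it can help to look at every header block rather than only the final value from curl_getinfo. The following is a minimal sketch, not part of the original answer: summarize_hops is a hypothetical helper name, and the sample header text is made up for illustration (it is not real Dropbox output). It assumes the raw headers were captured as a string, for example by also setting CURLOPT_RETURNTRANSFER so that curl_exec returns them.

```php
<?php
/**
 * Split raw header output (as produced with CURLOPT_HEADER and
 * CURLOPT_FOLLOWLOCATION) into one entry per hop, keeping the
 * status line and the Content-Type header if present.
 * Hypothetical helper, for illustration only.
 */
function summarize_hops(string $rawHeaders): array
{
    $hops = [];
    // Each response's header block is separated from the next by a blank line.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    foreach ($blocks as $block) {
        $lines = preg_split('/\r?\n/', trim($block));
        $hop = ['status' => array_shift($lines), 'content_type' => null];
        foreach ($lines as $line) {
            // Header names are case-insensitive (HTTP/2 sends them in lowercase).
            if (stripos($line, 'Content-Type:') === 0) {
                $hop['content_type'] = trim(substr($line, strlen('Content-Type:')));
            }
        }
        $hops[] = $hop;
    }
    return $hops;
}

// Made-up sample: a redirect followed by the final response.
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Location: https://example.com/file.zip\r\n"
        . "Content-Type: text/html; charset=utf-8\r\n"
        . "\r\n"
        . "HTTP/1.1 200 OK\r\n"
        . "Content-Type: application/zip\r\n"
        . "\r\n";

foreach (summarize_hops($sample) as $i => $hop) {
    echo $i, ': ', $hop['status'], ' | ', $hop['content_type'] ?? '(none)', "\n";
}
```

If only a single block with a text/html type comes back, the server most likely did not issue a redirect for that request, which matches the user-agent explanation above.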