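How can I determine whether a URL points to a ZIP file without downloading the whole thing first, since it might be too large? Can I somehow fetch just a few bytes and check for the ZIP header?
Answer 0: (score: 1)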
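I adapted the code from this answer so that it reads only 4 bytes from the response (using a Range header, or aborting the transfer once 4 bytes have been read) and then checks whether those 4 bytes match the zip magic header.
Give it a try and let me know the results. You may want to add some error checking so that, if the curl request fails for some reason, you can tell that the file type could not be determined.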
<?php
/**
* Try to determine if a remote file is a zip by making an HTTP request for
* a byte range or aborting the transfer after reading 4 bytes.
*
* @return bool true if the remote file is a zip, false otherwise
*/
function isRemoteFileZip($url)
{
    $ch = curl_init($url);
    $headers = array(
        'Range: bytes=0-3', // first 4 bytes; the end offset is inclusive
        'Connection: close',
    );
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2450.0 Iron/46.0.2450.0');
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    // note: certificate verification is disabled here
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_VERBOSE, 0); // set to 1 to debug
    curl_setopt($ch, CURLOPT_STDERR, fopen('php://output', 'w')); // stream must be opened for writing
    $header = '';
    // write function that receives data from the response and
    // aborts the transfer once at least 4 bytes have been read
    // (this covers servers that ignore the Range header)
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, function($curl, $data) use (&$header) {
        $header .= $data;
        if (strlen($header) < 4) return strlen($data);
        return 0; // abort transfer
    });
    // curl_exec() returns false when the callback aborts the transfer,
    // so its return value says nothing about the file type
    curl_exec($ch);
    // check for the zip magic header, return true if match, false otherwise
    return preg_match('/^PK(?:\x03\x04|\x05\x06|\x07\x08)/', $header) === 1;
}
var_dump(isRemoteFileZip('https://example.com/file.zip'));
var_dump(isRemoteFileZip('https://example.com/logo.png'));
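
As a rough sketch of the error checking suggested above, the signature test could be split into its own helper that returns null when fewer than 4 bytes arrived, so that "request failed" is not reported as "not a zip". The name classifyZipPrefix and the three-state return value are assumptions made for illustration; they are not part of the original code.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): classify the leading
 * bytes of a response body.
 *
 * @param string $bytes leading bytes of the response body
 * @return bool|null true if the bytes start with a zip signature,
 *                   false if they start with something else,
 *                   null if fewer than 4 bytes are available (undetermined)
 */
function classifyZipPrefix($bytes)
{
    if (strlen($bytes) < 4) {
        return null; // request failed or body too short to decide
    }
    // PK\x03\x04 = local file header, PK\x05\x06 = empty archive,
    // PK\x07\x08 = spanned archive marker
    $signatures = array("PK\x03\x04", "PK\x05\x06", "PK\x07\x08");
    return in_array(substr($bytes, 0, 4), $signatures, true);
}

var_dump(classifyZipPrefix("PK\x03\x04rest")); // bool(true)
var_dump(classifyZipPrefix("\x89PNG\r\n"));     // bool(false)
var_dump(classifyZipPrefix("PK"));              // NULL
```

Inside isRemoteFileZip, the final preg_match line could then hand $header to a helper like this one, and the caller could treat null as "unknown" rather than false.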