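How can I download a file in PHP using cURL when the header option (CURLOPT_HEADER) is set to true? Can I also get the file's name and extension?
Example PHP code: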
curl_setopt ($ch, CURLOPT_HEADER, 1);
$fp = fopen($strFilePath, 'w');
curl_setopt($ch, CURLOPT_FILE, $fp);
Answer 0 (score: 5)
Download a file or web page with PHP cURL and save it to a file:
<?php
/**
* Initialize the cURL session
*/
$ch = curl_init();
/**
* Set the URL of the page or file to download.
*/
curl_setopt($ch, CURLOPT_URL,
'http://news.google.com/news?hl=en&topic=t&output=rss');
/**
* Create a new file
*/
$fp = fopen('rss.xml', 'w');
/**
* Ask cURL to write the contents to a file
*/
curl_setopt($ch, CURLOPT_FILE, $fp);
/**
* Execute the cURL session
*/
curl_exec ($ch);
/**
* Close cURL session and file
*/
curl_close ($ch);
fclose($fp);
?>
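Answer 1 (score: 2)
Below is a complete example that uses a class. The header parsing is more elaborate than it needs to be, because I was laying the groundwork for full hierarchical header storage.
I just noticed that init() should reset more variables if the instance is to be reused for additional URLs, but this should at least give you a basis for downloading a file to the filename sent by the server.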
<?php
/*
* vim: ts=4 sw=4 fdm=marker noet tw=78
*/
class curlDownloader
{
private $remoteFileName = NULL;
private $ch = NULL;
private $headers = array();
private $response = NULL;
private $fp = NULL;
private $debug = FALSE;
private $fileSize = 0;
const DEFAULT_FNAME = 'remote.out';
public function __construct($url)
{
$this->init($url);
}
public function toggleDebug()
{
$this->debug = !$this->debug;
}
public function init($url)
{
if( !$url )
throw new InvalidArgumentException("Need a URL");
$this->ch = curl_init();
curl_setopt($this->ch, CURLOPT_URL, $url);
curl_setopt($this->ch, CURLOPT_HEADERFUNCTION,
array($this, 'headerCallback'));
curl_setopt($this->ch, CURLOPT_WRITEFUNCTION,
array($this, 'bodyCallback'));
}
public function headerCallback($ch, $string)
{
$len = strlen($string);
if( !strstr($string, ':') )
{
$this->response = trim($string);
return $len;
}
list($name, $value) = explode(':', $string, 2);
if( strcasecmp($name, 'Content-Disposition') == 0 )
{
$parts = explode(';', $value);
if( count($parts) > 1 )
{
foreach($parts AS $crumb)
{
if( strstr($crumb, '=') )
{
// Limit to 2 parts so a filename containing '=' is not truncated.
list($pname, $pval) = explode('=', $crumb, 2);
$pname = trim($pname);
if( strcasecmp($pname, 'filename') == 0 )
{
// Using basename to prevent path injection
// in malicious headers.
$this->remoteFileName = basename(
$this->unquote(trim($pval)));
$this->fp = fopen($this->remoteFileName, 'wb');
}
}
}
}
}
$this->headers[$name] = trim($value);
return $len;
}
public function bodyCallback($ch, $string)
{
if( !$this->fp )
{
trigger_error("No remote filename received, trying default",
E_USER_WARNING);
$this->remoteFileName = self::DEFAULT_FNAME;
$this->fp = fopen($this->remoteFileName, 'wb');
if( !$this->fp )
throw new RuntimeException("Can't open default filename");
}
$len = fwrite($this->fp, $string);
$this->fileSize += $len;
return $len;
}
public function download()
{
$retval = curl_exec($this->ch);
if( $this->debug )
var_dump($this->headers);
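// The file is only opened once a header or body chunk arrives,
// so guard against a transfer that produced neither.
if( $this->fp )
fclose($this->fp);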
curl_close($this->ch);
return $this->fileSize;
}
public function getFileName() { return $this->remoteFileName; }
private function unquote($string)
{
return str_replace(array("'", '"'), '', $string);
}
}
$dl = new curlDownloader(
'https://dl.example.org/torrent/cool-movie/4358-hash/download.torrent'
);
$size = $dl->download();
printf("Downloaded %u bytes to %s\n", $size, $dl->getFileName());
?>
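Answer 2 (score: 1)
I'm sure you have found the answer by now. Still, I'd like to share my script, which sends a JSON request to a server that returns the file in binary form and then passes it on to the browser as a download on the fly. Saving it to disk is not necessary. Hope it helps!
Note: you can skip converting the POST data to JSON.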
<?php
// Username or E-mail
$login = 'username';
// Password
$password = 'password';
// API Request
$url = 'https://example.com/api';
// POST data
$data = array('someTask', 24);
// Convert POST data to json
$data_string = json_encode($data);
// initialize cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
curl_setopt($ch, CURLOPT_USERPWD, "$login:$password");
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Execute cURL and store the response in a variable
$file = curl_exec($ch);
// Get the Header Size
$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
// Get the Header from response
$header = substr($file, 0, $header_size);
// Get the Body from response
$body = substr($file, $header_size);
// Explode Header rows into an array
$header_items = explode("\n", $header);
// Close cURL handler
curl_close($ch);
// define new variable for the File name
$file_name = null;
// find the filename in the headers.
if(!preg_match('/filename="(.*?)"/', $header, $matches)){
// If filename not found do something...
echo "Unable to find filename.<br>Please check the Response Headers or Header parsing!";
exit();
} else {
// If filename was found assign the name to the variable above
$file_name = $matches[1];
}
// Check header response, if HTTP response is not 200, then display the error.
if(!preg_match('/200/', $header_items[0])){
echo '<pre>'.print_r($header_items[0], true).'</pre>';
exit();
} else {
// Check header response, if HTTP response is 200, then proceed further.
// Set the header for PHP to tell it, we would like to download a file
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Disposition: attachment; filename="'.$file_name.'"');
// Echo out the body only ($file still contains the raw response headers),
// which then should trigger the download
echo $body;
exit;
}
?>
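One caveat about the status check in the script above: preg_match('/200/', ...) succeeds for any status line that merely contains the digits 200. A stricter check could look like the sketch below. The helper name parseStatusCode is my own invention for illustration, not something cURL or PHP provides.

```php
<?php
declare(strict_types=1);

/**
 * Extract the numeric status code from an HTTP status line such as
 * "HTTP/1.1 200 OK" or "HTTP/2 200". Returns null when the line does not
 * look like a status line. (parseStatusCode is a hypothetical helper name.)
 */
function parseStatusCode(string $statusLine): ?int
{
    // Protocol token, whitespace, then exactly three digits.
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', trim($statusLine), $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Usage with the first header row, e.g. $header_items[0] in the script above.
if (parseStatusCode("HTTP/1.1 200 OK\r") !== 200) {
    echo "Unexpected response status\n";
}
```

If you don't need to parse the header text yourself, curl_getinfo($ch, CURLINFO_HTTP_CODE) returns the same number, as long as you call it before curl_close().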
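Answer 3 (score: 1)
When you say
"if the header is set to true?"
I assume you mean "if CURLOPT_HEADER is set to true".
There are several ways to do it. Personally I prefer using CURLOPT_HEADERFUNCTION rather than CURLOPT_HEADER, but strictly speaking that doesn't answer your question. If for some reason you absolutely must use CURLOPT_HEADER, you can separate the body and the headers with strpos() + substr(),
for example: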
<?php
declare(strict_types = 1);
$ch= curl_init();
curl_setopt_array($ch,array(
CURLOPT_URL=>'http://example.org',
CURLOPT_HEADER=>1,
CURLOPT_RETURNTRANSFER=>1
));
$response = curl_exec($ch);
$header_body_separator = "\r\n\r\n";
$header_body_separator_position = strpos($response, $header_body_separator);
$separator_found = true;
if($header_body_separator_position === false){
// no body is present?
$header_body_separator_position = strlen($response);
$separator_found = false;
}
$headers = substr($response,0, $header_body_separator_position);
$headers = trim($headers);
$headers = explode("\r\n",$headers);
$body = ($separator_found ? substr($response, $header_body_separator_position + strlen($header_body_separator)) : "");
var_export(["headers"=>$headers,"body"=>$body]);die();
gives you:
array (
'headers' =>
array (
0 => 'HTTP/1.1 200 OK',
1 => 'Age: 240690',
2 => 'Cache-Control: max-age=604800',
3 => 'Content-Type: text/html; charset=UTF-8',
4 => 'Date: Fri, 06 Nov 2020 09:47:18 GMT',
5 => 'Etag: "3147526947+ident"',
6 => 'Expires: Fri, 13 Nov 2020 09:47:18 GMT',
7 => 'Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT',
8 => 'Server: ECS (nyb/1D20)',
9 => 'Vary: Accept-Encoding',
10 => 'X-Cache: HIT',
11 => 'Content-Length: 1256',
),
'body' => '<!doctype html>
<html>
<head>
<title>Example Domain</title>
(...capped)
However, I don't recommend this approach, nor do I recommend using CURLOPT_HEADER. Instead, I suggest using CURLOPT_HEADERFUNCTION, for example:
<?php
declare(strict_types = 1);
$ch = curl_init();
$headers = [];
curl_setopt_array($ch, array(
CURLOPT_URL => 'http://example.org',
CURLOPT_HEADERFUNCTION => function ($ch, string $header) use (&$headers): int {
$header_trimmed = trim($header);
if (strlen($header_trimmed) > 0) {
$headers[] = $header_trimmed;
}
return strlen($header);
},
CURLOPT_RETURNTRANSFER => 1
));
$body = curl_exec($ch);
var_export([
"headers" => $headers,
"body" => $body
]);
which gives you exactly the same result with simpler code:
array (
'headers' =>
array (
0 => 'HTTP/1.1 200 OK',
1 => 'Age: 604109',
2 => 'Cache-Control: max-age=604800',
3 => 'Content-Type: text/html; charset=UTF-8',
4 => 'Date: Fri, 06 Nov 2020 09:50:32 GMT',
5 => 'Etag: "3147526947+ident"',
6 => 'Expires: Fri, 13 Nov 2020 09:50:32 GMT',
7 => 'Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT',
8 => 'Server: ECS (nyb/1D2E)',
9 => 'Vary: Accept-Encoding',
10 => 'X-Cache: HIT',
11 => 'Content-Length: 1256',
),
'body' => '<!doctype html>
<html>
<head>
(capped)
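Another option is CURLINFO_HEADER_OUT (note that it reports the request headers that were sent), but I don't recommend using CURLINFO_HEADER_OUT in PHP because it is buggy: https://bugs.php.net/bug.php?id=65348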
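The $headers array collected by the CURLOPT_HEADERFUNCTION callback is a flat list of lines, so looking up one header (say Content-Disposition) means scanning it. A minimal sketch of turning those lines into a lookup table follows. headerLinesToMap is a name I made up, and joining repeated headers with ", " is a simplifying assumption that does not suit every header (Set-Cookie, for instance).

```php
<?php
declare(strict_types=1);

/**
 * Convert raw header lines into an array keyed by lower-cased header name.
 * Lines without a colon (the status line, blank lines) are skipped.
 * Repeated headers are joined with ", " (a simplifying assumption).
 * (headerLinesToMap is a hypothetical helper name.)
 */
function headerLinesToMap(array $lines): array
{
    $map = [];
    foreach ($lines as $line) {
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue; // status line or blank line
        }
        $name  = strtolower(trim(substr($line, 0, $pos)));
        $value = trim(substr($line, $pos + 1));
        $map[$name] = isset($map[$name]) ? $map[$name] . ', ' . $value : $value;
    }
    return $map;
}

// Usage with lines shaped like the output shown above.
$map = headerLinesToMap([
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Length: 1256',
]);
echo $map['content-type'], "\n";
```

Header names are case-insensitive, which is why the keys are lower-cased before storing them.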
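Answer 4 (score: 0)
To get the headers and the data separately, you typically use a header callback and a body callback, as in this example: http://curl.haxx.se/libcurl/php/examples/callbacks.html
To get the filename from the headers, check for a Content-Disposition: header and extract the filename from it (if present), or just use the filename part of the URL or something similar. Your choice.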
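A rough sketch of that choice, including the extension the question asks about. pickFileName and the 'download.bin' default are names I invented for illustration, and the regex only handles the plain filename= parameter, not the encoded filename*= form.

```php
<?php
declare(strict_types=1);

/**
 * Pick a local filename: prefer the filename parameter of a
 * Content-Disposition value, otherwise the last path segment of the URL,
 * otherwise $default. (pickFileName is a hypothetical helper name.)
 */
function pickFileName(?string $contentDisposition, string $url, string $default = 'download.bin'): string
{
    $name = '';
    // Accept both filename="quoted name.ext" and filename=bare.ext
    if ($contentDisposition !== null
        && preg_match('/filename\s*=\s*("[^"]*"|[^;\s]+)/i', $contentDisposition, $m) === 1) {
        $name = trim($m[1], "\"'");
    }
    if ($name === '') {
        // Fall back to the path component of the URL.
        $path = parse_url($url, PHP_URL_PATH);
        $name = is_string($path) ? rawurldecode($path) : '';
    }
    // basename() drops directory parts so a hostile header cannot pick the path.
    $name = basename(str_replace('\\', '/', $name));
    return ($name === '' || $name === '.' || $name === '..') ? $default : $name;
}

// Usage: the header value would come from your header callback.
$name = pickFileName('attachment; filename="report.pdf"', 'https://example.com/api');
$ext  = pathinfo($name, PATHINFO_EXTENSION);
printf("%s (%s)\n", $name, $ext);
```

Keep in mind that the extension is only whatever the server or URL claims. If the real file type matters, check the Content-Type header or the file contents as well.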