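How can I force curl (with PHP) to download a page the way a browser does? The page I want to download is a price comparison site, for example http://www.ceneo.pl/22416171. It is public, so anyone can access it.
To check whether the page can be downloaded with curl, I ran this on my local Debian-based server: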
curl http://www.ceneo.pl/22416171
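It works flawlessly. But I really need this on my virtual PHP-Apache server, so I have to do it in PHP.
When I try to download the page with PHP's curl, it gives me nothing, unlike curl in the shell. Why? How do I get the correct content in PHP?
My attempt: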
<?php
$curl = curl_init('http://www.ceneo.pl/22416171');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_HEADER, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl,CURLOPT_HTTPHEADER,
array(
'User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:28.0) Gecko/20100101 Firefox/28.0',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: pl,en-US;q=0.7,en;q=0.3',
'Accept-Encoding: gzip, deflate',
'p3p: CP="NOI CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT"',
'Vary: Accept-Encoding',
'Content-Type: text/html; charset=utf-8',
'Cache-Control: private'
));
$body = curl_exec($curl);
curl_close($curl);
echo $body;
?>
I also tried using
<?php exec('curl http://www.ceneo.pl/22416171'); ?>
but it gave
curl: /usr/local/lib/libcurl.so.4: no version information available (required by curl)
Answer 0 (score: 1)
Check the documentation: {{3}}
This is how you do it:
test.php
<?php
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "http://www.ceneo.pl/22416171");
//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
//set headers
curl_setopt($ch,CURLOPT_HTTPHEADER, array(
'User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:28.0) Gecko/20100101 Firefox/28.0',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: pl,en-US;q=0.7,en;q=0.3',
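// Accept-Encoding is left out: when it is set by hand, curl returns the compressed body without decoding it
//'Accept-Encoding: gzip, deflate',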
'p3p: CP="NOI CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT"',
'Vary: Accept-Encoding',
'Content-Type: text/html; charset=utf-8',
'Cache-Control: private'
));
// $output contains the output string
$output = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
// debug
echo $output;
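A likely reason the question's version returned nothing usable is the hand-written `Accept-Encoding: gzip, deflate` header: curl then receives a compressed body but does not decode it. As a minimal sketch of a variant that keeps compression, the request can set `CURLOPT_ENCODING` so curl negotiates and decodes the encoding itself, and check `curl_exec` for failure instead of echoing an empty result. The helper names `browserLikeOptions` and `fetchPage`, the timeout value, and the choice to follow redirects are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): builds the cURL
// options for a browser-like GET request.
function browserLikeOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects, as a browser would
        CURLOPT_ENCODING       => '',   // offer every encoding curl supports and decode the response
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.0; rv:28.0) Gecko/20100101 Firefox/28.0',
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language: pl,en-US;q=0.7,en;q=0.3',
        ],
        CURLOPT_TIMEOUT        => 30,   // assumed limit in seconds
    ];
}

// Hypothetical helper: fetches the page and throws if the transfer fails.
function fetchPage(string $url): string
{
    $ch = curl_init();
    curl_setopt_array($ch, browserLikeOptions($url));
    $body = curl_exec($ch);
    if ($body === false) {
        // Read the error before closing the handle.
        $message = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("cURL request failed: $message");
    }
    curl_close($ch);
    return $body;
}
```

Calling `echo fetchPage('http://www.ceneo.pl/22416171');` should then print the page's HTML, provided the cURL extension is enabled on the server. If the transfer fails, the exception message carries curl's own error text, which is usually more helpful than an empty page.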
Working demo of test.php (retrieves only the site's HTML output):