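I'm using the following script to scrape data. When I open the PHP page, it works for only a few sites such as Google; for every other site it just shows a blank page. Is there something wrong with the code, and how can I debug it?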
<?php
$request = curl_init("https://www.google.com");
curl_setopt($request, CURLOPT_RETURNTRANSFER, true);
curl_setopt($request, CURLOPT_HTTPHEADER, array(
'Content-type: application/json',
'Authorization: Bearer 31d15a'
));
$response = curl_exec($request);
echo $response;
curl_close($request);
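Answer 0 (score: 0)
A blank page often means that curl_exec() returned false (echoing false prints nothing) or that the site answered with a redirect or rejected a request that had no User-Agent header. There are more options that can be set, but the ones below are probably enough for your task.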
//URL to fetch (taken from the question; replace it with your target site)
$url = "https://www.google.com";
//initialise cURL
$ch = curl_init();
//set the URL to be used
curl_setopt($ch, CURLOPT_URL, $url);
//follow HTTP 3xx redirects
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
//automatically set the Referer header when following a redirect
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
//return the response as a string from curl_exec() instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
//don't verify the peer's SSL certificate (insecure; use only while debugging)
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
//send a browser-like User-Agent header
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
//execute the cURL session; $html is false if the transfer failed
$html = curl_exec($ch);
//suppress libxml warnings (only matters if $html is parsed afterwards, e.g. with DOMDocument)
libxml_use_internal_errors(true);
//close the cURL session (has no effect since PHP 8.0, where the handle is freed automatically)
curl_close($ch);
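As for the debugging part of the question, here is a minimal sketch of how the failure could be made visible instead of ending in a blank page. It relies on the standard curl_errno(), curl_error() and curl_getinfo() calls. The helper names describe_curl_result() and fetch_with_diagnostics() are made up for this example and are not part of cURL, and the 10-second connect timeout is an arbitrary assumption.

```php
<?php
// Hypothetical helper (not part of cURL): turns the values cURL reports
// after a request into a one-line diagnostic message.
function describe_curl_result($body, int $errno, string $error, int $status): string
{
    if ($body === false) {
        // curl_exec() returned false: the transfer itself failed
        // (DNS, TLS, timeout, ...), so there is nothing to echo.
        return "cURL error $errno: $error";
    }
    if ($status >= 400) {
        // The request reached the server, but it answered with an error.
        return "HTTP $status returned by the server";
    }
    if ($body === '') {
        // A successful transfer with no content also renders as a blank page.
        return "HTTP $status with an empty body";
    }
    return "HTTP $status, " . strlen($body) . " bytes received";
}

// Hypothetical wrapper: fetches $url and returns [body, diagnostic message].
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Assumed value: give up connecting after 10 seconds.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);

    $body   = curl_exec($ch);
    $errno  = curl_errno($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // No explicit curl_close(): the handle is released when $ch goes out of scope.
    return [$body, describe_curl_result($body, $errno, $error, $status)];
}
```

For example, running [$body, $info] = fetch_with_diagnostics("https://www.google.com"); followed by echo $info; prints either the HTTP status and body size or the cURL error, which shows whether the transfer failed or the server simply sent nothing useful. While debugging, it may also help to enable display_errors so that PHP errors are shown rather than hidden.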