I am trying to build a scraper for Facebook groups, and I run this to fetch the page contents:
<?php
$url = "https://www.facebook.com/groups/theGroupId/";
$ch = curl_init($url); // create a cURL handle for the URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
$curl_scraped_page = curl_exec($ch); // execute the request; returns the response body, or false on failure
var_dump($curl_scraped_page);
if ($curl_scraped_page === false) {
die(curl_error($ch));
}
curl_close($ch);
echo $curl_scraped_page;
?>
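I get this error: "SSL certificate problem: unable to get local issuer certificate".
I came across this tutorial: http://unitstep.net/blog/2009/05/05/using-curl-in-php-to-access-https-ssltls-protected-sites/ which explains why this happens and how to solve it in two different ways. I tried both approaches but still get the same error message: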
<?php
$url = "https://www.facebook.com/groups/{theGroupId}/";
$ch = curl_init($url); // create a cURL handle for the URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
//curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/CAcerts/GTECyberTrustGlobalRoot.crt");
$curl_scraped_page = curl_exec($ch); // execute the request; returns the response body, or false on failure
var_dump($curl_scraped_page);
if ($curl_scraped_page === false) {
die(curl_error($ch));
}
curl_close($ch);
echo $curl_scraped_page;
?>
This is the exact output (using var_dump):
boolean false
SSL certificate problem: unable to get local issuer certificate
Am I doing something wrong? Is this even the right approach?
Answer 0 (score: 1)
<?php
$url = "http://www.facebook.com/groups/4189052132/";
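function curl($url, $session = null) { // $session: optional cookie string such as "name=value"
$options = Array(
CURLOPT_RETURNTRANSFER => TRUE, // return the page as a string
CURLOPT_FOLLOWLOCATION => TRUE, // follow "Location:" redirects
CURLOPT_AUTOREFERER => TRUE, // set the Referer header when following redirects
CURLOPT_CONNECTTIMEOUT => 120, // seconds to wait while connecting
CURLOPT_TIMEOUT => 120, // seconds allowed for the whole request
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",
CURLOPT_URL => $url
);
if ($session !== null) {
$options[CURLOPT_COOKIE] = $session; // send the session cookie only when one was supplied
}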
$ch = curl_init();
curl_setopt_array($ch, $options);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$scraped_page = curl($url);
echo $scraped_page;
?>
You don't need to verify its certificate. That verification is why you are running into the problem.
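If you would rather keep verification switched on, here is a minimal sketch of the other route. One possible reason the CURLOPT_CAINFO attempt in the question still failed is that the single root file does not match the CA that actually issued the site's certificate, so pointing cURL at a complete CA bundle may help. The function names `build_verified_options` and `fetch_verified` are made up for illustration, and the bundle path is an assumption: you would need to download a bundle yourself (the curl project publishes a `cacert.pem` extracted from Mozilla's list).

```php
<?php
// Sketch only: keep certificate verification enabled and supply a full CA bundle.
// ASSUMPTION: $caBundle is the path to a complete CA bundle file (e.g. cacert.pem)
// that you have downloaded yourself. Requires the PHP cURL extension.

// Build the option array separately so it can be inspected without a network call.
function build_verified_options(string $url, string $caBundle): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,      // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,      // follow redirects
        CURLOPT_SSL_VERIFYPEER => true,      // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,         // verify the host name matches the certificate
        CURLOPT_CAINFO         => $caBundle, // trusted CA certificates
    ];
}

// Fetch a page with verification on; throws instead of returning false on failure.
function fetch_verified(string $url, string $caBundle): string
{
    if (!is_readable($caBundle)) {
        throw new RuntimeException("CA bundle not readable: $caBundle");
    }
    $ch = curl_init();
    curl_setopt_array($ch, build_verified_options($url, $caBundle));
    $body = curl_exec($ch);
    if ($body === false) {
        $error = curl_error($ch); // read the error before closing the handle
        curl_close($ch);
        throw new RuntimeException("cURL error: $error");
    }
    curl_close($ch);
    return $body;
}
```

You would call it as something like `echo fetch_verified("https://www.facebook.com/groups/{theGroupId}/", getcwd() . "/CAcerts/cacert.pem");`, where the `cacert.pem` location is hypothetical. Whether the page you get back contains the group content is a separate matter, since that may depend on being logged in.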