Get HTML content from a given HTTPS URL

Time: 2015-01-27 21:51:59

Tags: php parsing curl file-get-contents

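I need to download the HTML content of an HTTPS URL so that I can parse some links from it.

For non-HTTPS URLs I can do this without any problem using:

file_get_contents

For the HTTPS URL I tried this code: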

$ch = curl_init('http://kickass.so/best-new-restaurant-s01e01-italian-cuisine-hdtv-x264-daview-t10113796.html');
curl_setopt_array($ch, array(
    CURLOPT_SSL_VERIFYPEER => true,
    CURLOPT_SSL_VERIFYHOST => 2,
    CURLOPT_VERBOSE => true,
    CURLOPT_CAINFO => 'I:/dev/ServerPHP/movieGather/UniServerZ/core/apache2/server_certs/server.crt',
));

if (false === curl_exec($ch)) {
    echo "Error while loading page: ", curl_error($ch), "\n";
}

But it does not work. Any suggestions?

3 Answers:

Answer 0: (score: 1)

Try this:

$url = 'https://www.example.com/abc';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Blindly accept the certificate (insecure: only for hosts you trust)
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

// An empty string requests every encoding cURL supports and decodes the response
curl_setopt($ch, CURLOPT_ENCODING, '');
$response = curl_exec($ch);
curl_close($ch);

var_dump($response);

See more options here:

http://php.net/manual/en/function.curl-setopt.php

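Answer 1: (score: 0)

If you know and trust the source, and the source will always be the same website, you can skip strict SSL verification.

This comes from another SO answer, PHP's cURL: How to connect over HTTPS?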

$url = 'https://www.example.com/abc';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Blindly accept the certificate (insecure: only for hosts you trust)
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

$response = curl_exec($ch);
curl_close($ch);

var_dump($response);
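
If the source is not fully trusted, verification can stay enabled instead. The sketch below is one way to do that, not a definitive fix: it assumes a PEM bundle of trusted CA certificates is available locally (for example the cacert.pem file published by the curl project). The helper names `verified_curl_options` and `fetch_verified` and the bundle path are hypothetical. Note that the question passes the local Apache server's own certificate as CURLOPT_CAINFO, which is probably not a CA bundle that can validate the remote site.

```php
<?php
// Hypothetical helper: builds cURL options that keep certificate checks enabled.
// $caBundle is an assumed path to a PEM file of trusted CA certificates.
function verified_curl_options(string $url, string $caBundle): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_SSL_VERIFYPEER => true,      // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,         // verify the host name against the certificate
        CURLOPT_CAINFO         => $caBundle, // trusted CA certificates
        CURLOPT_FOLLOWLOCATION => true,      // follow HTTP redirects
    ];
}

// Hypothetical helper: fetches $url and throws if the transfer fails.
function fetch_verified(string $url, string $caBundle): string
{
    $ch = curl_init();
    curl_setopt_array($ch, verified_curl_options($url, $caBundle));

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $body;
}
```

A call would look like `$html = fetch_verified('https://www.example.com/abc', '/path/to/cacert.pem');`, where both arguments are placeholders.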

Answer 2: (score: 0)

function nget($url)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_REFERER, $url);

    // Skip certificate checks (insecure: only for hosts you trust)
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($curl, CURLOPT_POST, FALSE);

    curl_setopt($curl, CURLOPT_HEADER, TRUE);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($curl, CURLOPT_AUTOREFERER, TRUE);
    curl_setopt($curl, CURLOPT_FAILONERROR, TRUE);
    curl_setopt($curl, CURLOPT_ENCODING, ''); // empty string: accept every encoding cURL supports

    curl_setopt($curl, CURLOPT_COOKIEJAR, 'cookie.txt');
    curl_setopt($curl, CURLOPT_COOKIEFILE, 'cookie.txt');

    curl_setopt($curl, CURLOPT_HTTPHEADER, ['Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9']);

    curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36');

    $content = curl_exec($curl);
    curl_close($curl);
    return $content;
}

$url = 'https://example.com';
$m = nget($url);
var_dump($m);
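
Since the goal in the question is to parse links, here is a minimal sketch of that step using PHP's DOMDocument. The function name `extract_links` and the sample HTML are hypothetical, and the sketch assumes the input is the HTML body only; because `nget` above sets CURLOPT_HEADER, its return value also contains the response headers, which would need to be turned off or stripped first.

```php
<?php
// Hypothetical helper: returns the href of every <a> element in an HTML string.
function extract_links(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Sample input (made up for illustration).
$sample = '<html><body><a href="/first">One</a> <a href="https://example.com/second">Two</a></body></html>';
print_r(extract_links($sample));
```

Relative links such as `/first` are returned as written; resolving them against the page URL is left out of this sketch.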