Using Simple HTML DOM Parser with curl for a gzipped URL

Date: 2016-10-16 11:57:00

Tags: php curl

I want to fetch and process a web page with the file_get_html() function, so I did my best to reimplement it with the curl functions, as follows:

function file_get_html_new($url, $use_include_path = false, $context=null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HTTPHEADER, $context );
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $contents = curl_exec($ch);
    curl_close($ch);
    if (empty($contents) || strlen($contents) > MAX_FILE_SIZE)
    {
        return false;
    }
    $dom->load($contents, $lowercase, $stripRN);
    return $dom;
}
$html = file_get_html_new( 'http://***.us/'. $imdb_id , false , array('Host: ***.us',
'Connection: keep-alive',
'Cache-Control: max-age=0',
'Upgrade-Insecure-Requests: 1',
'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Encoding: gzip, deflate, sdch',
'Accept-Language: en-US,en;q=0.8,fa;q=0.6',
'Cookie: ***',
'AlexaToolbar-ALX_NS_PH: AlexaToolbar/alx-4.0'));

But running the code gives me the following error:

PHP Fatal error:  Call to a member function load() on a non-object

I printed the raw result of the curl call like this:

function file_get_html_new($url, $use_include_path = false, $context=null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HTTPHEADER, $context );
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $contents = curl_exec($ch);
    curl_close($ch);
    if (empty($contents) || strlen($contents) > MAX_FILE_SIZE)
    {
        return false;
    }
    //$dom->load($contents, $lowercase, $stripRN);
    echo $contents;
}

The output is garbled, and I found that this happens because the content is gzip-compressed. I decompressed it like this:

$contents = gzinflate( substr(curl_exec($ch),10,-8) );

and I also tried:

$contents = gzdecode (curl_exec($ch));
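
For comparison, here is a minimal sketch of the two decompression calls above, run on a locally compressed string instead of a live response. The sample HTML and the helper name fetch_decoded() are assumptions made for illustration; they are not part of the original code. The substr() offsets only match a gzip stream whose header is exactly 10 bytes (no optional fields such as an embedded file name), so gzdecode() is the less fragile of the two. The helper shows another option that may be simpler: setting CURLOPT_ENCODING to an empty string asks curl to advertise the encodings it supports and to decode the body itself.

```php
<?php
// Stand-in for a gzip-compressed body as curl_exec() might return it.
$original   = '<html><body><p>Hello</p></body></html>';
$compressed = gzencode($original);

// gzdecode() parses the gzip header and trailer on its own (PHP >= 5.4).
$decoded = gzdecode($compressed);

// The substr() variant assumes a fixed 10-byte header and 8-byte trailer.
// That holds for gzencode() output, but not for every server response.
$inflated = gzinflate(substr($compressed, 10, -8));

var_dump($decoded === $original);
var_dump($inflated === $original);

// Hypothetical helper: let curl negotiate and decode the encoding.
function fetch_decoded($url, array $headers = array())
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    // Empty string: send an Accept-Encoding header listing every encoding
    // this curl build supports and decompress the response automatically.
    curl_setopt($ch, CURLOPT_ENCODING, '');
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    $body = curl_exec($ch); // false on failure
    curl_close($ch);
    return $body;
}
```

With this approach it is probably best to leave the manual 'Accept-Encoding: gzip, deflate, sdch' entry out of the header array, since curl cannot decode sdch.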

Now I get the correct content, but the error is still there! Can you help me understand why?
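
One likely explanation, going only by the code shown: inside file_get_html_new() the variable $dom is never assigned, so $dom->load() is called on null whether or not the content has been decompressed. $lowercase and $stripRN are undefined there as well. The sketch below creates the parser object first. It assumes simple_html_dom.php is available next to the script, and the function name html_from_string() and the sample markup are hypothetical.

```php
<?php
// Assumption: the Simple HTML DOM library file sits next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical helper: build a parser object from an already decoded HTML string.
function html_from_string($contents, $lowercase = true, $stripRN = true)
{
    // MAX_FILE_SIZE is defined by simple_html_dom.php.
    if (empty($contents) || strlen($contents) > MAX_FILE_SIZE) {
        return false;
    }
    // This is the step missing from file_get_html_new(): create the object
    // before calling load() on it.
    $dom = new simple_html_dom();
    $dom->load($contents, $lowercase, $stripRN);
    return $dom;
}

$html = html_from_string('<html><body><p class="t">Hello</p></body></html>');
if ($html !== false) {
    echo $html->find('p.t', 0)->plaintext, "\n";
}
```

The library's own str_get_html() function performs much the same steps, so passing the decoded curl result to it may be enough.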

0 answers:

No answers yet