Question

I want to load only the contents of the head using curl. At the moment I am using:

<?php 

$url="www.facebook.com";

$title = '';
$keywords = '';
$description = '';
$timeout = 5;

$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, 'http://' . $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // give up after $timeout seconds

$html = curl_exec($ch);
curl_close($ch);
echo htmlspecialchars($html); // gives the complete source. Why?

//parsing begins here:
$doc = new DOMDocument();
@$doc->loadHTML($html);

$nodes = $doc->getElementsByTagName('title');
$metas = $doc->getElementsByTagName('meta');

if($nodes->length>0)$title = $nodes->item(0)->nodeValue;

for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');
    if($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');
}
echo $title. '<br/>';
echo "&nbsp;&nbsp;&nbsp;&nbsp;$description". '<br/>';
echo "&nbsp;&nbsp;&nbsp;&nbsp;$keywords";
?>

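This code returns the complete source of the page, but I only want the head. Please don't relate this to my previous question, because there is no need to use CURLOPT_WRITEFUNCTION here.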

Answer 1

CURLOPT_HEADER should be TRUE, not 0.

CURLOPT_NOBODY should be TRUE:

curl_setopt($ch, CURLOPT_NOBODY, TRUE);
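
As a rough sketch of what these two options do together: with CURLOPT_NOBODY set, curl sends a HEAD request, and with CURLOPT_HEADER set the return value holds the HTTP response headers (status line, Content-Type, and so on) rather than any HTML markup. The helper names fetch_http_headers and parse_http_headers are made up for illustration, and the 5-second timeout is an assumption.

```php
<?php
// Hypothetical helper: send a HEAD request and return the raw HTTP response headers.
function fetch_http_headers(string $url, int $timeout = 5): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, no response body
    curl_setopt($ch, CURLOPT_HEADER, true);          // include HTTP headers in the return value
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the result instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // each redirect adds its own header block
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    $raw = curl_exec($ch);
    return $raw === false ? '' : $raw;
}

// Hypothetical helper: parse the last header block into a lowercase name => value array.
function parse_http_headers(string $raw): array
{
    $blocks = preg_split('/\r?\n\r?\n/', trim($raw));
    $last = end($blocks);
    $headers = [];
    foreach (preg_split('/\r?\n/', $last) as $line) {
        if (strpos($line, ':') === false) {
            continue; // status line such as "HTTP/1.1 200 OK"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return $headers;
}
```

Calling parse_http_headers(fetch_http_headers('http://' . $url)) would give the headers of the final response after redirects. Nothing in that result contains the page's title or meta tags.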

Answer 2

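Despite the similar names, HEADER does not correspond to the HTML <head>, and BODY does not correspond to the HTML <body>. CURLOPT_HEADER means "include the HTTP headers in the return value". CURLOPT_NOBODY means "do not include the HTTP payload in the return value" (for an HTTP response with Content-Type: text/html, the payload is the entire HTML document).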
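
If the goal is the HTML <head> element itself, one possible approach (a sketch, not the only way) is to download the document as the question already does and then serialize just the <head> node with DOMDocument. The function name extract_head_html is hypothetical.

```php
<?php
// Hypothetical helper: return the markup of the <head> element of a full HTML document.
function extract_head_html(string $html): string
{
    if ($html === '') {
        return ''; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate invalid real-world markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $head = $doc->getElementsByTagName('head')->item(0);
    return $head === null ? '' : $doc->saveHTML($head);
}
```

With $html taken from curl_exec() in the question, echo htmlspecialchars(extract_head_html($html)); would print only the head markup. The whole page is still downloaded; only the output is trimmed.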

Load head content using curl
