PHP get_headers() reports different headers than cURL

Time: 2012-08-31 07:33:43

Tags: php curl http-headers

How can get_headers() return different results than fetching the headers with cURL? Here is my code:

header("Content-type: text/plain");
$url = 'http://www.foxbusiness.com/index.html';

echo "get_headers() headers:\n\n";
$headers = get_headers($url);
print_r($headers);

echo "\n\nCURL headers\n\n";
$curl = curl_init();
curl_setopt_array( $curl, array(
    CURLOPT_HEADER => true,
    CURLOPT_NOBODY => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL => $url ) );
$headers = explode( "\n", curl_exec( $curl ) );
curl_close( $curl );
print_r($headers);

Here is the result:

get_headers() headers:

Array
(
    [0] => HTTP/1.0 403 Forbidden
    [1] => Server: AkamaiGHost
    [2] => Mime-Version: 1.0
    [3] => Content-Type: text/html
    [4] => Content-Length: 283
    [5] => Expires: Fri, 31 Aug 2012 07:29:14 GMT
    [6] => Date: Fri, 31 Aug 2012 07:29:14 GMT
    [7] => Connection: close
)


CURL headers

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Server: Apache
    [2] => X-FoxNews-EdgeTTL: 2m
    [3] => Content-Type: text/html;charset=UTF-8
    [4] => Cache-Control: max-age=64
    [5] => Date: Fri, 31 Aug 2012 07:29:14 GMT
    [6] => Connection: keep-alive
    [7] => 
    [8] => 
)

2 answers:

Answer 0 (score: 6)

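By default, get_headers performs a GET request, whereas you have configured cURL to perform a HEAD request (that is what CURLOPT_NOBODY does). First, make the request identical to what cURL sends by setting an HTTP stream context that uses HEAD as the request method.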

In addition, the server appears to require a user agent, so either set user_agent in php.ini or add it to the stream context.
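
As a rough sketch of the php.ini route: the user_agent setting can also be changed at runtime with ini_set(), which may be convenient when you cannot edit php.ini. The value "PHP" here is only a placeholder, not something the server specifically requires.

```php
<?php
// Runtime equivalent of putting `user_agent = "PHP"` in php.ini.
// "PHP" is a placeholder; any string the server accepts will do.
ini_set('user_agent', 'PHP');

// The http:// stream wrapper (used by get_headers, file_get_contents, fopen)
// sends this value as the User-Agent header unless a stream context overrides it.
echo ini_get('user_agent'), "\n";
```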

The following should work:

stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD',
            'user_agent' => "PHP"
        )
    )
);

See http://codepad.viper-7.com/cOO9XS

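Note that stream_context_set_default modifies the global default stream context, so after the call above, every other function that uses this stream wrapper will also issue HEAD requests. Unlike, for example, file_get_contents, get_headers did not accept a custom stream context as an argument when this was written (a context parameter was added in PHP 7.1). In other words, make sure to change the method back to GET once you have the headers.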
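
One way to make the "change it back to GET" step hard to forget is to wrap it in a small helper. The sketch below is only an illustration: with_head_default is a hypothetical name, not a PHP function, and it assumes PHP 5.5 or later for the finally block.

```php
<?php
// Hypothetical helper (not part of PHP): switch the global default HTTP
// context to HEAD, run a callback, then switch the method back to GET.
function with_head_default(callable $fn, $userAgent = 'PHP')
{
    stream_context_set_default(array(
        'http' => array(
            'method'     => 'HEAD',
            'user_agent' => $userAgent,
        ),
    ));

    try {
        return $fn();
    } finally {
        // Runs even if $fn throws, so later http:// calls use GET again.
        // Only the method is reset; the user_agent option stays in place.
        stream_context_set_default(array(
            'http' => array('method' => 'GET'),
        ));
    }
}

// Usage (performs a network request, so it is left commented out):
// $headers = with_head_default(function () {
//     return get_headers('http://www.foxbusiness.com/index.html');
// });
```

On PHP 7.1 and later it is probably simpler to build a context with stream_context_create() and pass it as the third argument to get_headers(), which leaves the global default untouched.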

Answer 1 (score: 4)

Set a different User-Agent header before calling get_headers:

stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD',
            'header' => "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.79 Safari/537.1\r\n"
        )
    )
);

You might as well specify HEAD too, since you only need the headers. With this change, you get the correct headers.

Output:

get_headers() headers:

Array
(
    [0] => HTTP/1.0 200 OK
    [1] => Server: Apache
    [2] => X-FoxNews-EdgeTTL: 2m
    [3] => Content-Type: text/html;charset=UTF-8
    [4] => Cache-Control: max-age=76
    [5] => Date: Fri, 31 Aug 2012 07:53:24 GMT
    [6] => Connection: close
)


CURL headers

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Server: Apache
    [2] => X-FoxNews-EdgeTTL: 2m
    [3] => Content-Type: text/html;charset=UTF-8
    [4] => Cache-Control: max-age=76
    [5] => Date: Fri, 31 Aug 2012 07:53:24 GMT
    [6] => Connection: keep-alive
    [7] => 
    [8] => 
)
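
The empty [7] and [8] entries in the cURL array are not extra headers. The raw header block ends with "\r\n\r\n", and explode("\n", ...) turns that into a lone "\r" and an empty string. If you want the two arrays to be directly comparable, a small helper along these lines might do; split_raw_headers is a hypothetical name used only for this sketch.

```php
<?php
// Hypothetical helper: split a raw header block (as returned by curl_exec
// with CURLOPT_HEADER) into clean lines, similar in shape to get_headers().
function split_raw_headers($raw)
{
    // Accept CRLF, bare LF, or bare CR as line terminators.
    $lines = preg_split('/\r\n|\r|\n/', trim($raw));

    // Drop empty lines, such as the blank line that ends a header block.
    return array_values(array_filter($lines, 'strlen'));
}

$raw = "HTTP/1.1 200 OK\r\nServer: Apache\r\nConnection: keep-alive\r\n\r\n";
print_r(split_raw_headers($raw));
```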