Question

I am trying to download the contents of a web page using PHP. When I issue the command:

$f = file_get_contents("http://mobile.mybustracker.co.uk/mobile.php?searchMode=2");

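it returns a page reporting that the server is down. Yet when I paste the same URL into my browser, I get the expected page.

Does anyone know what is causing this? Does file_get_contents send any headers that distinguish it from a browser request?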

Answer 1

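Yes, there are differences: browsers tend to send a lot of additional HTTP headers, and the headers that both of them send may not have the same values.

Here, after a couple of tests, it seems that you need to pass the HTTP header called Accept.

This can be done using the third parameter of file_get_contents, which specifies additional context information: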

$opts = array('http' =>
    array(
        'method'  => 'GET',
        //'user_agent'  => "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2) Gecko/20100301 Ubuntu/9.10 (karmic) Firefox/3.6",
        'header' => array(
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
        ),
    )
);
$context  = stream_context_create($opts);

$f = file_get_contents("http://mobile.mybustracker.co.uk/mobile.php?searchMode=2", false, $context);
echo $f;

With this, I am able to get the HTML code of the page.

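Notes:

- I first tested passing the User-Agent, but it does not seem to be necessary, which is why the corresponding line is commented out here.
- The value used for the Accept header is the one Firefox sent when I requested that page with Firefox, before trying file_get_contents.
- Some other values might be OK, but I did not do any tests to determine which value is required.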
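
If you want to try different header values yourself, a small helper along these lines may make that easier. This is only a sketch: build_http_context is a hypothetical name, not a PHP function. It relies on the documented 'header' and 'user_agent' HTTP context options and on stream_context_get_options to show what would be sent, without making any request.

```php
<?php
// Hypothetical helper (not part of PHP): builds an HTTP stream context
// from a list of "Name: value" header lines and an optional User-Agent.
function build_http_context(array $headers, $userAgent = null)
{
    $http = array(
        'method' => 'GET',
        // Joining the lines with CRLF gives the single-string form of 'header'.
        'header' => implode("\r\n", $headers),
    );
    if ($userAgent !== null) {
        // 'user_agent' is a documented HTTP context option.
        $http['user_agent'] = $userAgent;
    }
    return stream_context_create(array('http' => $http));
}

$context = build_http_context(array(
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
));

// Inspect the options held by the context; no request is made here.
$options = stream_context_get_options($context);
echo $options['http']['header'], "\n";
```

The resulting $context can then be passed as the third argument of file_get_contents, as in the code above.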

For more information, you can take a look at:

file_get_contents
stream_context_create
Context options and parameters
HTTP context options - this is the interesting page here ;-)

Answer 2

Replace all spaces with %20.
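
For a URL that does contain spaces, the replacement could look roughly like this. The URL below is a made-up example for illustration, not the one from the question.

```php
<?php
// Hypothetical URL containing a space, for illustration only.
$url = 'http://example.com/search.php?q=bus stop&mode=2';

// Replace each literal space with its percent-encoded form.
$encoded = str_replace(' ', '%20', $url);

echo $encoded, "\n"; // prints http://example.com/search.php?q=bus%20stop&mode=2

// The encoded URL is what would then be passed on:
// $f = file_get_contents($encoded);
```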
