When I try to read a website's source, I sometimes get the following (example URL shown):
Warning: file_get_contents(http://www.iwantoneofthose.com/gift-novelty/golf-ball-finding-glasses/10602617.html)
[function.file-get-contents]: failed to open stream: HTTP request failed!
HTTP/1.1 500 Internal Server Error in /home/public_html/pages/scrape.html on line 165
Yet the URL itself works fine. Why does this happen?
I tried the following suggested workaround, but the result is the same:
$opts = array('http'=>array('header' => "User-Agent:MyAgent/1.0\r\n"));
$context = stream_context_create($opts);
$header = file_get_contents('https://www.example.com',false,$context);
This has me baffled...
Answer 0 (score: 2)
The problem is in your User-Agent header. This worked for me:
$opts = array('http'=>array('header' => "User-Agent:Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.75 Safari/537.1\r\n"));
$context = stream_context_create($opts);
$header = file_get_contents('http://www.iwantoneofthose.com/gift-novelty/golf-ball-finding-glasses/10602617.html',false,$context);
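To see what the server actually sends back when it rejects a request, it can help to read the status line and the error body instead of letting file_get_contents stop at the warning. The following is a minimal sketch, assuming the standard `ignore_errors` HTTP context option and the `$http_response_header` variable that PHP's HTTP wrapper fills in; `status_code_from_headers` and `fetch_with_status` are hypothetical helper names, not part of the original answer.

```php
<?php
// Hypothetical helper: extract the numeric status code from the header
// lines collected for an HTTP request.
function status_code_from_headers(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // After redirects there are several status lines; keep the last one.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Hypothetical helper: fetch a URL without failing on 4xx/5xx responses.
// Returns array(status code or null, body or false).
function fetch_with_status(string $url, string $userAgent): array
{
    $context = stream_context_create(array('http' => array(
        'header'        => "User-Agent: $userAgent\r\n",
        'ignore_errors' => true, // hand back the body even for error statuses
    )));
    $body = @file_get_contents($url, false, $context);
    // The HTTP wrapper populates $http_response_header in the local scope.
    $headers = isset($http_response_header) ? $http_response_header : array();
    return array(status_code_from_headers($headers), $body);
}
```

Calling `fetch_with_status($url, 'MyAgent/1.0')` and then again with a browser-like User-Agent makes it possible to compare the two status codes directly.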
Answer 1 (score: 2)
I don't know the exact reason, but file_get_contents fails with some servers. You do have another option, though:
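$fp = fsockopen("www.iwantoneofthose.com", 80, $errn, $errs);
if (!$fp) {
    // Connection failed; $errn and $errs hold the error number and message
    die("Could not connect: $errs ($errn)");
}
// Build the raw HTTP request by hand
$out = "GET /gift-novelty/golf-ball-finding-glasses/10602617.html HTTP/1.1\r\n";
$out .= "Host: www.iwantoneofthose.com\r\n";
$out .= "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0\r\n";
$out .= "Connection: close\r\n";
$out .= "\r\n";
fwrite($fp, $out);
// Read until the server closes the connection
$response = "";
while (($chunk = fread($fp, 4096)) !== false && $chunk !== "") {
    $response .= $chunk;
}
fclose($fp);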
$response_body = substr($response, strpos($response, "\r\n\r\n") + 4);
// or
list($response_headers, $response_body) = explode("\r\n\r\n", $response, 2);
print $response_body;
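One caveat with the raw-socket approach: because the request says HTTP/1.1, the server may reply with `Transfer-Encoding: chunked`, in which case the body contains hexadecimal chunk sizes mixed in with the content. The following is a minimal sketch of undoing that, assuming the whole response is already in memory; `decode_chunked` and `extract_body` are hypothetical helper names, and trailers after the final chunk are simply ignored.

```php
<?php
// Hypothetical helper: decode a chunked HTTP body held in a string.
function decode_chunked(string $body): string
{
    $decoded = '';
    $offset = 0;
    $length = strlen($body);
    while ($offset < $length) {
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break; // malformed input: no chunk-size line
        }
        // The size line may carry extensions after ';', which are dropped here.
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $sizeHex = trim(explode(';', $sizeLine, 2)[0]);
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            break; // not a valid hexadecimal chunk size
        }
        $size = (int) hexdec($sizeHex);
        if ($size === 0) {
            break; // a zero-sized chunk marks the end of the body
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        // Skip the chunk data and the CRLF that follows it.
        $offset = $lineEnd + 2 + $size + 2;
    }
    return $decoded;
}

// Hypothetical helper: split a raw response into headers and body,
// decoding the body only if the headers announce chunked encoding.
function extract_body(string $response): string
{
    $parts = explode("\r\n\r\n", $response, 2);
    $headers = $parts[0];
    $body = isset($parts[1]) ? $parts[1] : '';
    if (preg_match('/^Transfer-Encoding:\s*chunked\s*$/mi', $headers)) {
        return decode_chunked($body);
    }
    return $body;
}
```

With these in place, `print extract_body($response);` could stand in for the manual split. Sending `HTTP/1.0` in the request line is another way to sidestep chunking, since chunked encoding is an HTTP/1.1 feature.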