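I am using cURL to open an HTTPS page. The page I request issues a redirect. I have set cURL to follow redirects, but I can't seem to get it to request the correct page. I traced the same request in my browser, and the browser makes a different request than cURL does. What can I do to correct this? The correct URL does appear in the verbose cURL dump, right after "* Issue another request to this URL".
Here is a snippet of the verbose cURL output: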
< HTTP/1.1 302 Moved Temporarily
< Location: /XXX
< Content-Type: text/html; charset=UTF-8
< Date: Tue, 31 Dec 2013 15:51:46 GMT
< Expires: Tue, 31 Dec 2013 15:51:46 GMT
< Cache-Control: private, max-age=0
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< Server: GSE
< Alternate-Protocol: 443:quic
< Transfer-Encoding: chunked
<
* Ignoring the response-body
* Connection #0 to host 127.0.0.1 left intact
* Issue another request to this URL: 'XYYYZ'
* Re-using existing connection! (#0) with host 127.0.0.1
* Connected to 127.0.0.1 (127.0.0.1) port 8888 (#0)
> GET /??? HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0
The PHP code I am using is as follows:
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);
curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
$target = ADDR;
curl_setopt($ch, CURLOPT_URL, $target);
$page = curl_exec($ch);
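Answer 0 (score: 0)
cURL does follow the Location: header, but make sure you send the same headers the browser sends (for example Accept-Language and Referer) using the CURLOPT_HTTPHEADER option, because some servers refuse connections that look like automated requests. In Firefox, you can use the Live HTTP Headers add-on to see exactly what the browser sends.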
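As a rough illustration of the CURLOPT_HTTPHEADER suggestion, here is a minimal sketch. The helper name build_browser_headers, the header values, and the example.com referer are all placeholders invented for this example; copy the real values from what your browser actually sends.

```php
<?php
// Hypothetical helper: builds request headers resembling a browser's.
// The values are placeholders, not taken from the original capture.
function build_browser_headers(string $referer): array
{
    return [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
        'Referer: ' . $referer,
    ];
}

$headers = build_browser_headers('https://example.com/login');

$ch = curl_init();
// Each array element is one complete "Name: value" header line.
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
```

The remaining options (cookies, proxy, URL) would be set on the same handle as in the question's code.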
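Also check whether the Location: header contains an absolute URL rather than a relative path. The original HTTP/1.1 specification (RFC 2616) required an absolute URI; RFC 7231 later allowed relative references, and the Location: /XXX in your output is one.
If that doesn't work, you can turn on the CURLOPT_HEADER option and use curl_getinfo() to catch the 302 and redirect manually.
Here is an example that does it manually, so you can also check whether the redirects end up in an infinite loop.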
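If the server sends a relative Location such as /XXX, it can be turned into an absolute URL before it is requested again. The following is a simplified sketch, not a full RFC 3986 resolver: resolve_location is a hypothetical helper name, the example.com URLs are placeholders, and "../" segments are not normalised.

```php
<?php
// Hypothetical helper: resolves a Location value against the URL of the
// request that produced the redirect. Handles absolute URLs,
// scheme-relative ("//host/path"), root-relative ("/path") and plain
// relative paths. Does not normalise "../" segments.
function resolve_location(string $base, string $location): string
{
    $location = trim($location);

    // Already absolute (starts with a scheme)?
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $location)) {
        return $location;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';

    // Scheme-relative: reuse the scheme of the base URL.
    if (strpos($location, '//') === 0) {
        return $scheme . ':' . $location;
    }

    $origin = $scheme . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }

    // Root-relative: replace the whole path.
    if (strpos($location, '/') === 0) {
        return $origin . $location;
    }

    // Plain relative: resolve against the directory of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $location;
}

echo resolve_location('https://example.com/a/b?x=1', '/XXX'), "\n";
```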
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);
curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0); // do not follow redirects automatically
curl_setopt($ch, CURLOPT_HEADER, true); // include the response headers in $page so they can be parsed below
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
$target = ADDR;
curl_setopt($ch, CURLOPT_URL, $target);
$page = curl_exec($ch);
$curl_info = curl_getinfo($ch);
if ($curl_info['http_code'] == 302 || $curl_info['http_code'] == 301)
{
    // The first header_size bytes of the response are the headers
    $response_headers = substr($page, 0, $curl_info['header_size']);
    if (preg_match('#^Location:[ \t]*(.*)#mi', $response_headers, $location_header))
    {
        // Call cURL again to follow the location; better to wrap the cURL calls in a function, e.g. follow_location
        // $location_header is an array
        // $location_header[0] is the whole header line, e.g. "Location: http://..."
        // $location_header[1] is the URL only, which you can pass back to cURL
        echo trim($location_header[1]); // trim() removes the trailing "\r"
    }
}
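The follow_location wrapper mentioned in the comments could look roughly like this. It is only a sketch: the function signature, the $fetch callable, the hop limit of 5, and the array keys are assumptions made for this example. The cURL-backed fetcher relies on CURLINFO_REDIRECT_URL, which reports the URL that should be requested next when CURLOPT_FOLLOWLOCATION is disabled.

```php
<?php
// Hypothetical helper: follows redirects by hand, at most $maxRedirects hops.
// $fetch is any callable that takes a URL and returns
// ['http_code' => int, 'location' => ?string, 'body' => string].
function follow_location(callable $fetch, string $url, int $maxRedirects = 5): array
{
    $visited = [];
    for ($hop = 0; $hop <= $maxRedirects; $hop++) {
        if (isset($visited[$url])) {
            throw new RuntimeException('Redirect loop detected at ' . $url);
        }
        $visited[$url] = true;

        $response   = $fetch($url);
        $isRedirect = in_array($response['http_code'], [301, 302, 303, 307, 308], true);
        if (!$isRedirect || empty($response['location'])) {
            $response['url'] = $url; // final URL that was actually fetched
            return $response;
        }
        $url = $response['location'];
    }
    throw new RuntimeException('Too many redirects');
}

// A cURL-backed fetcher (assumed options; add cookies, proxy etc. as needed).
$curlFetch = function (string $url): array {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    $next = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    return [
        'http_code' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'location'  => $next ?: null,
        'body'      => $body,
    ];
};

// Usage (performs real requests, so it is left commented out):
// $result = follow_location($curlFetch, ADDR);
// echo $result['url'];
```

Passing the fetcher in as a callable keeps the loop logic separate from the transport, so the hop limit and loop detection can be checked without touching the network.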