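I have a PHP script that is supposed to pick a proxy from a list, connect through it, and download a file. Some of the proxies (roughly 200-400 working ones) work fine, but others fail and I cannot figure out why.
Here is the code that connects through the proxy: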
$proxy = determine_proxy ($proxyList);
$proxyString = 'tcp://' . $proxy['ip'] . ':' . $proxy['port'];
$userAgent = $userAgents [rand (0, $agentsCount - 1)];
// set up our headers
$hdrs = array( 'http' => array(
    'method' => "GET",
    'header'=> "Host: www.example.net\r\n" .
        // "User-Agent: $userAgent\r\n" .
        "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n" .
        "Accept-Language: en-us,en;q=0.5\r\n" .
        "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n" .
        "Keep-Alive: 115\r\n" .
        "Proxy-Connection: keep-alive\r\n" .
        "Referer: http://$url", // Setting the http-referer
    'proxy' => "$proxyString",
    'request_fulluri' => true
    )
);
echo "Using proxy: "; print_r ($proxy); echo '<br>';
$context = stream_context_create ($hdrs); // set up the context
$timeout = 3;
$oldTimeout = ini_set('default_socket_timeout', $timeout);
$oldAgent = ini_set ('user_agent', $userAgent);
$fp = fopen ("http://www.example.net$file", 'r', false, $context); // open the file
if (!$fp) {
    echo 'fopen failed! Skipping this proxy for now...<br>';
    print_r ($http_response_header); echo '<br />';
    unset ($http_response_header);
    flush(); @ob_flush();
    ini_set ('user_agent', $oldAgent);
    ini_set('default_socket_timeout', $oldTimeout);
    continue;
}
print_r ($http_response_header); echo '<br />';
unset ($http_response_header);
Strangely, the response headers for a failed attempt are sometimes empty, and sometimes they look like this:
Array (
[0] => HTTP/1.0 200 OK
[1] => Server: falcon
[2] => Date: Sun, 16 Jan 2011 14:06:37 GMT
[3] => Content-Type: application/x-bittorrent
[4] => Cache-Control: must-revalidate, post-check=0, pre-check=0
[5] => Content-Disposition: attachment; filename="example.torrent"
[6] => Vary: Accept-Encoding,User-Agent
[7] => Connection: close
)
And sometimes they look like this:
Array (
[0] => HTTP/1.0 200 OK
[1] => Server: falcon
[2] => Date: Sun, 16 Jan 2011 14:06:47 GMT
[3] => Content-Type: application/x-bittorrent
[4] => Cache-Control: must-revalidate, post-check=0, pre-check=0
[5] => Content-Disposition: attachment; filename="example2.torrent"
[6] => Vary: Accept-Encoding,User-Agent
[7] => X-Cache: MISS from proxy
[8] => Proxy-Connection: close
)
These are the response headers from a successful attempt:
HTTP/1.0 200 OK
Server: falcon
Date: Fri, 21 Jan 2011 18:53:00 GMT
Content-Type: application/x-bittorrent
Cache-Control: must-revalidate, post-check=0, pre-check=0
Content-Disposition: attachment; filename="example3.torrent"
Vary: Accept-Encoding,User-Agent
X-Cache: MISS from www.example.com
X-Cache-Lookup: MISS from www.example.com:3128
Via: 1.0 www.example.com (squid/3.0.STABLE23-BZR)
Proxy-Connection: close
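I set the user agent to a valid user-agent string, and I checked that allow_url_fopen is set to On.
From RFC 2616, section 10:
200 OK
The request has succeeded. The information returned with the response is dependent on the method used in the request, for example:
GET an entity corresponding to the requested resource is sent in the response;
How can fopen still fail when the proxy returns status 200? Does anyone know this problem and how to fix it?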
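Answer 0 (score: 2)
The problem was that the socket timeout I had set was, in some cases, too low for fopen to connect and download all the data. When the timeout expired and fopen had still not received the data, it returned FALSE and raised an "HTTP request failed" error.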
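A minimal sketch of that fix, assuming PHP 7.1 or later. Instead of lowering default_socket_timeout globally, it sets the read timeout per request through the http context option 'timeout', then checks the stream metadata to tell a timeout apart from other failures. The helper names build_proxy_context_options and fetch_via_proxy and the 30-second value are illustrative assumptions, not something from the question.

```php
<?php
// Hypothetical helper: builds http context options for one proxy,
// with a per-request read timeout instead of a global ini setting.
function build_proxy_context_options(string $proxyIp, int $proxyPort, float $timeout): array
{
    return [
        'http' => [
            'method'          => 'GET',
            'proxy'           => "tcp://$proxyIp:$proxyPort",
            'request_fulluri' => true,
            'timeout'         => $timeout, // read timeout in seconds
        ],
    ];
}

// Hypothetical helper: returns the body, or null if the request
// failed or the stream timed out before all data arrived.
function fetch_via_proxy(string $url, array $options): ?string
{
    $context = stream_context_create($options);
    $fp = @fopen($url, 'r', false, $context);
    if ($fp === false) {
        return null; // could not connect or no usable response in time
    }
    $body = stream_get_contents($fp);
    $meta = stream_get_meta_data($fp);
    fclose($fp);
    // 'timed_out' is set when the stream timed out while waiting for data.
    if ($body === false || $meta['timed_out']) {
        return null;
    }
    return $body;
}
```

In the loop from the question, this would be called as fetch_via_proxy("http://www.example.net$file", build_proxy_context_options($proxy['ip'], (int) $proxy['port'], 30.0)), skipping the proxy when the result is null. The right timeout depends on how slow the proxies are; 30 seconds is only a starting guess.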
Answer 1 (score: 0)
The server reports 200 OK, but the proxy still does not know where to forward the data, so the request fails...
Try using the Via header.
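One way to read this suggestion is to append a Via line to the request headers passed in the stream context. This is only a sketch: the helper name with_via_header and the pseudonym 'my-client' are hypothetical, and nothing in the question confirms that adding Via changes how these proxies behave.

```php
<?php
// Hypothetical helper: appends a Via header line to an existing header string.
// Format per RFC 2616 section 14.45: "Via: <protocol-version> <received-by>".
function with_via_header(string $headers, string $receivedBy, string $version = '1.0'): string
{
    // Normalize the existing headers so they end with exactly one CRLF.
    $headers = rtrim($headers, "\r\n");
    $prefix = $headers === '' ? '' : $headers . "\r\n";
    return $prefix . "Via: $version $receivedBy\r\n";
}
```

The result can be used as the 'header' value of the http context array, for example with_via_header("Host: www.example.net\r\n", 'my-client').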