This is the curl function that requests the URLs.
function get_result( $nodes )
{
    $node_count = count($nodes);
    $curl_arr = array();
    $master = curl_multi_init();
    for($i = 0; $i < $node_count; $i++)
    {
        $url = $nodes[$i];
        $curl_arr[$i] = curl_init($url);
        curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 180);
        curl_setopt($curl_arr[$i], CURLOPT_TIMEOUT, 180);
        curl_setopt($curl_arr[$i], CURLOPT_ENCODING, "gzip");
        curl_setopt($curl_arr[$i], CURLOPT_PROXY, '127.0.0.1:8888');
        curl_setopt($curl_arr[$i], CURLOPT_VERBOSE, true);
        curl_setopt($curl_arr[$i], CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201');
        curl_setopt($curl_arr[$i], CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($curl_arr[$i], CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);
        curl_multi_add_handle($master, $curl_arr[$i]);
    }
    do {
        curl_multi_exec($master,$running);
        curl_multi_select($master, 5.0);
    } while($running > 0);
    $output = "";
    for($i = 0; $i < $node_count; $i++)
    {
        $output .= curl_multi_getcontent( $curl_arr[$i] );
    }
    return $output;
}
$offset = 0;
function select_data()
{
    global $conn_to_sql;
    global $offset;
    $select_statement = $conn_to_sql->prepare("SELECT url FROM url_list LIMIT 5 OFFSET $offset");
    $select_statement->setFetchMode(PDO::FETCH_ASSOC);
    $offset += 5;
    $select_statement->execute();
    return $select_statement->fetchAll();
}
while( select_data() )
{
    $datas = select_data();
    foreach ( $datas as $data )
    {
        $dat = $data["url"];
        $nodes[] = $dat;
    }
    get_result( $nodes );
}
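get_result is called from a loop with an array of 5 URLs. (The URLs are loaded from a table with LIMIT 5 and an OFFSET that increases by 5.) However, the number of URLs requested grows by 5 on every call.
The first get_result call requests 5 URLs.
The next call requests 10 URLs (the next 10 URLs, without duplication), then 15 URLs (the next 15 URLs, without duplication), and it continues like that: 20, 25, 30, 35, ...
How do I know the requests are increasing? All traffic goes through a proxy (Fiddler).
get_result should request only 5 URLs each time, but that is not what happens. How can I fix this?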
Answer 0: (score: 0)
Your query is executed afterwards, so the URLs are sent one at a time rather than five at a time ...
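Answer 1: (score: 0)
It looks like the $nodes array is collecting too many results. It is never emptied between iterations, so each pass through the loop appends five more URLs to the ones already in it. Reset it before filling it: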
while( select_data() )
{
    $datas = select_data();
    $nodes=array();/*reset $nodes array here*/
    foreach ( $datas as $data )
    {
        $dat = $data["url"];
        $nodes[] = $dat;
    }
    get_result( $nodes );
}
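One more thing worth checking in the loop above: select_data() is called twice per iteration, once in the while condition and once in the body, and each call advances $offset by 5. The rows fetched by the condition are thrown away, so every other batch is skipped. Below is a minimal sketch of a loop that fetches once per iteration and resets $nodes. It is not the asker's real code: fetch_batch() and fake_get_result() are hypothetical stand-ins for select_data() and get_result(), and the 12 example.com URLs are made-up data, so the sketch runs without a database or network access.

```php
<?php
// Made-up stand-in for the url_list table: 12 rows, one URL each.
$rows = array();
for ($i = 1; $i <= 12; $i++) {
    $rows[] = array('url' => "http://example.com/page/$i");
}

// Hypothetical stand-in for select_data(): up to $limit rows starting at $offset.
function fetch_batch(array $rows, int $offset, int $limit = 5): array
{
    return array_slice($rows, $offset, $limit);
}

// Hypothetical stand-in for get_result(): records the batch instead of requesting it.
function fake_get_result(array $nodes, array &$log): void
{
    $log[] = $nodes;
}

$offset  = 0;
$batches = array();

// Fetch once per iteration; an empty result is falsy and ends the loop.
while ($datas = fetch_batch($rows, $offset)) {
    $offset += 5;
    $nodes = array(); // start each batch with an empty list
    foreach ($datas as $data) {
        $nodes[] = $data['url'];
    }
    fake_get_result($nodes, $batches);
}

foreach ($batches as $n => $batch) {
    echo "batch $n: " . count($batch) . " URLs\n";
}
```

With the 12 sample rows this prints batch sizes of 5, 5 and 2. In the real script the same shape would be `while ( $datas = select_data() )` followed by the reset and the call to get_result( $nodes ).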