I found a way to get the HTTP response code from a URL using get_headers($url). The function returns an array like this:
Array
(
[0] => HTTP/1.1 200 OK
[1] => Date: Sat, 29 May 2004 12:28:13 GMT
[2] => Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
[3] => Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
[4] => ETag: "3f80f-1b6-3e1cb03b"
[5] => Accept-Ranges: bytes
[6] => Content-Length: 438
[7] => Connection: close
[8] => Content-Type: text/html
)
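As a minimal sketch of getting from that array to the numeric code: the status line is element 0, so the code can be read out of it with a regular expression. The helper name `status_code_from_headers` is hypothetical (not part of PHP), and the sample data is a shortened copy of the array above so that no network access is needed.

```php
<?php
// Hypothetical helper (not from the original post): returns the numeric
// status code from an array shaped like the output of get_headers().
function status_code_from_headers(array $headers)
{
    $code = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 200 OK". When redirects are
        // followed there can be several, so keep the last one seen.
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code; // null when no status line was found
}

// Sample data copied from the array shown above.
$sample = array(
    'HTTP/1.1 200 OK',
    'Date: Sat, 29 May 2004 12:28:13 GMT',
    'Content-Type: text/html',
);
var_dump(status_code_from_headers($sample)); // int(200)
```

With a live URL this would be called as `status_code_from_headers(get_headers($url))`, after checking that get_headers() did not return false.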
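My problem is that I may have a large list of URLs that I want to loop over, getting the HTTP response code for each one. With potentially 100 URLs, calling this function in a loop seems like an ugly and slow approach.
How can I speed this up and make it cleaner, or is this already the best way? I'd love to hear your suggestions.
Thanks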
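Answer 0: (score: 0)
The network calls themselves take time, but you can finish sooner by running them in parallel. One way to do that is curl_multi. Give me a moment and I'll write up an example.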
// Set up the list of URLs and an array to hold the responses
$urls = array(
    'http://www.livestrong.com/',
    'http://www.apple.com/'
    // add more URLs here
);
$responses_by_url = array();

// Create the multi handle
$multi = curl_multi_init();
foreach ($urls as $url) {
    // Add a request for each URL
    $ch = curl_init($url);
    // Attach the URL to the handle so it can be recovered on completion
    // (curl handles cannot be used as array keys on PHP 8+)
    curl_setopt($ch, CURLOPT_PRIVATE, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    // Only the headers are needed, so save bandwidth with a HEAD request.
    // CURLOPT_NOBODY sends HEAD and tells curl not to wait for a body.
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_multi_add_handle($multi, $ch);
}

// Start the multi request
$still_running = 0;
curl_multi_exec($multi, $still_running);

// Loop while waiting for completion
do {
    curl_multi_select($multi); // waits for activity on any handle (or a timeout)
    curl_multi_exec($multi, $still_running); // get new state
    // Read all newly available information
    while ($info = curl_multi_info_read($multi)) {
        if ($info['msg'] === CURLMSG_DONE) {
            // This transfer is finished; look up which URL it belongs to
            $handle = $info['handle'];
            $url = curl_getinfo($handle, CURLINFO_PRIVATE);
            if ($info['result'] === CURLE_OK) {
                // Result OK: split the raw header text into lines
                $header_text = curl_multi_getcontent($handle);
                $responses_by_url[$url] = explode("\r\n", trim($header_text));
            } else {
                // Record the error reported for this handle
                $responses_by_url[$url] = "error: " . curl_error($handle);
            }
            // Detach the handle whether it succeeded or failed
            curl_multi_remove_handle($multi, $handle);
        }
    }
} while ($still_running);

// Clean up
curl_multi_close($multi);

// Output results
var_dump($responses_by_url);
Answer 1: (score: 0)
You need to use curl_multi_init() to run those 100 requests faster. There is a small PHP library, php-multi-curl, that can help with the task.
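For comparison, here is a rough sketch of the same idea written directly against PHP's curl_multi_* functions; it does not use php-multi-curl, whose API is not shown here. The function name `fetch_status_codes`, the `$window` parameter (how many requests run at once), and the 10-second timeout are all assumptions made for illustration. It reads the code with curl_getinfo() and CURLINFO_HTTP_CODE instead of parsing header text.

```php
<?php
// Sketch only: fetch_status_codes() and $window are hypothetical names,
// built on PHP's standard curl_multi_* functions (requires ext-curl).
// Returns array(url => int status code, or null if the transfer failed).
// Duplicate URLs share one key in the result.
function fetch_status_codes(array $urls, $window = 10)
{
    $codes = array();
    $queue = array_values($urls);
    $multi = curl_multi_init();
    $active = 0; // number of handles currently attached to $multi

    // Attach the next queued URL, if any, as a HEAD request.
    $add_next = function () use (&$queue, &$active, $multi) {
        if (!$queue) {
            return;
        }
        $url = array_shift($queue);
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);         // send HEAD, skip the body
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo anything
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed limit, in seconds
        curl_setopt($ch, CURLOPT_PRIVATE, $url);        // remember which URL this is
        curl_multi_add_handle($multi, $ch);
        $active++;
    };

    // Start at most $window transfers; the rest wait in the queue.
    for ($i = 0; $i < $window; $i++) {
        $add_next();
    }

    while ($active > 0) {
        curl_multi_exec($multi, $running);
        while ($info = curl_multi_info_read($multi)) {
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $codes[$url] = ($info['result'] === CURLE_OK)
                ? (int) curl_getinfo($ch, CURLINFO_HTTP_CODE)
                : null;
            curl_multi_remove_handle($multi, $ch);
            $active--;
            $add_next(); // keep the window full
        }
        // Wait for activity; back off briefly if select has nothing to wait on.
        if ($active > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000);
        }
    }

    curl_multi_close($multi);
    return $codes;
}
```

Calling `fetch_status_codes($urls)` with a list of 100 URLs would keep up to ten requests in flight at a time. Capping the window is a design choice to avoid opening every connection at once; raising `$window` trades politeness toward the remote servers for speed.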