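I have a PHP script for a URL status checker tool that checks the given URLs and reports the ones that return a 404 error.
The StatusCheckerRequest input contains URLs separated by "\n".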
public function PostStatusChecker(StatusCheckerRequest $request){
    $urls = $request->source;
    $seperateURLs = explode("\n", $urls);
    // -- create all the individual cURL handles and set their options
    $curl_handles = array();
    foreach ($seperateURLs as $url) {
        $curl_handles[$url] = curl_init();
        curl_setopt($curl_handles[$url], CURLOPT_URL, $url);
        curl_setopt($curl_handles[$url], CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl_handles[$url], CURLOPT_CONNECTTIMEOUT, 20);
        curl_setopt($curl_handles[$url], CURLOPT_SSL_VERIFYPEER, false);
    }
    // -- start going through the cURL handles and running them
    $curl_multi_handle = curl_multi_init();
    $i = 0; // count where we are in the list so we can break up the runs into smaller blocks
    $block = array(); // to accumulate the curl_handles for each group we'll run simultaneously
    $results = array();
    $curlErrors = array();
    foreach ($curl_handles as $a_curl_handle) {
        $i++; // increment the position-counter
        // add the handle to the curl_multi_handle and to our tracking "block"
        curl_multi_add_handle($curl_multi_handle, $a_curl_handle);
        $block[] = $a_curl_handle;
        // -- check to see if we've got a "full block" to run or if we're at the end of our list of handles
        if (($i % BLOCK_SIZE == 0) or ($i == count($curl_handles))) {
            // -- run the block
            $running = NULL;
            do {
                // track the previous loop's number of handles still running so we can tell if it changes
                $running_before = $running;
                // run the block or check on the running block and get the number of sites still running in $running
                curl_multi_exec($curl_multi_handle, $running);
                print_r(curl_multi_info_read($curl_multi_handle));
            } while ($running > 0);
            // -- once the number still running is 0, curl_multi_ is done, so check the results
            foreach ($block as $handle) {
                // HTTP response code
                $code = curl_getinfo($handle, CURLINFO_HTTP_CODE);
                $results['httpCode'][] = $code;
                // cURL error number
                $curl_errno = curl_errno($handle);
                $results['curlErrorNo'][] = $curl_errno;
                // cURL error message
                $curl_error = curl_error($handle);
                $results['curlErrorMessage'][] = $curl_error;
                // remove the (used) handle from the curl_multi_handle
                curl_multi_remove_handle($curl_multi_handle, $handle);
            }
            // reset the block to empty, since we've run its curl_handles
            $block = array();
        }
    }
    // close the curl_multi_handle once we're done
    curl_multi_close($curl_multi_handle);
    print_r($results);
    die();
}
I used the curl_multi_exec example from Stack Overflow. When I check the results with these URLs:
Array
(
[0] => stackoverfloww.com
[1] => www.laravel2.com
[2] => http://stackoverflow.com
[3] => http://laravel.com
)
I get this output:
[httpCode] => Array
(
[0] => 0
[1] => 0
[2] => 0
[3] => 301
)
[curlErrorMessage] => Array
(
[0] => Illegal characters found in URL
[1] => Illegal characters found in URL
[2] => Illegal characters found in URL
[3] =>
)
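I tried different inputs, and every time only the last URL returns 200 or 301 while all the others return 0. I also checked the output of curl_multi_info_read: every entry has result 3 ("Illegal characters found in URL") except the last one, whose result is 0.
Can you help me solve this? Thank you very much.
Answer 0 (score: 2)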
A quick search of the cURL source code shows that this error is raised when the URL passed to CURLOPT_URL contains the character \r and/or \n.
From lib/url.c:
/* We might pass the entire URL into the request so we need to make sure
* there are no bad characters in there.*/
if(strpbrk(data->change.url, "\r\n")) {
failf(data, "Illegal characters found in URL");
return CURLE_URL_MALFORMAT;
}
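You should run each URL through $url = trim($url); before passing it to cURL, because there is probably a leftover \r or \n at the end of it. If the input uses \r\n line endings, explode("\n", $urls) leaves a trailing \r on every line except the last, which would explain why only the last URL succeeds.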
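As a minimal sketch of the trimming advice above, the input could be cleaned before any cURL handle is created. The helper name splitUrls is hypothetical and not part of the original code, and the sample input assumes the request body uses \r\n line endings, which the question does not confirm.

```php
<?php
// Hypothetical helper (not from the original post): turn a multi-line
// string into a list of clean URLs, whatever line endings it uses.
function splitUrls(string $source): array
{
    // \R matches any line-break sequence, so "\r\n" is consumed as a whole
    // and no stray "\r" is left at the end of a line.
    $lines = preg_split('/\R/', $source);

    // Remove leading and trailing whitespace from every line.
    $lines = array_map('trim', $lines);

    // Drop blank lines and reindex the array from 0.
    return array_values(array_filter($lines, static function ($line) {
        return $line !== '';
    }));
}

// Assumed sample input with Windows-style line endings.
$source = "stackoverfloww.com\r\nwww.laravel2.com\r\nhttp://stackoverflow.com\r\nhttp://laravel.com";
$urls = splitUrls($source);
```

In the question's code, $seperateURLs = splitUrls($request->source); could then replace the explode("\n", $urls) call, so that no URL handed to CURLOPT_URL contains \r or \n.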