Here is the actual function:
$linkArray = array(//associative array for title and url of site and etc
"url"=>"",
"title"=>"",
"num"=>""
);
$site = $_POST['link'];
function parseLink($link){
global $conn;//reinitialize conn variable b/c inside fn it's unknown unless reinitialized
//get title of page
$page_title = get_url_title($link);
//1-Get HTML content
$html = file_get_contents($link);
$a_tag = preg_match_all('/<a[^>]+>/i', $html, $result);
$i = 0;//dummy var
//check if we get result
if($a_tag > 0){
while($i < $a_tag){
$alink = $result[0][$i];
preg_match_all('/(href)=("[^"]*")/i',$alink, $href);
$hreflnk = $href[0][0];
$hreflnk = str_replace('href="',"",$hreflnk);
$hreflnk = substr($hreflnk, 0, -1);
//filter out only links that start with http -> this will automatically include https
//$randomvar = preg_match("/^http/i",$hreflnk,$outp);
if(preg_match("/^http/i",$hreflnk)){
$hreflnk = $hreflnk;
} else{
//idk
$hreflnk = '';
}
$linkArray['url'][$i] = $hreflnk;
$linkArray['title'] = $page_title;
$linkArray['num'] = $a_tag;
$siteTitle = $linkArray['title'];
$numLinks = $linkArray['num'];
$siteUrl = $linkArray['url'][$i];
$queryI = $conn->query("INSERT INTO data (title, url, numLinks) VALUES ('$siteTitle','$siteUrl','$numLinks')");
$outLink = $linkArray['url'][0];
$i++;
}
}
return $outLink;
}
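This function takes a URL, searches its HTML, stores every http link in an array, and then uploads the links to the database. Before it ends, it returns the first link it collected. What I want to do is feed that link back into the same function, so that the process repeats.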
Answer 0 (score: 3)
Just use a recursive function:
function foo($i = 0) {
if ($i == 100) {
return 'yay';
} else {
$i++;
return foo($i);
}
}
foo();
Or do you really need to repeat a function at all?
$i = 0;
while($i != 100) {
$i++;
}
Or:
$i = 0;
do {
$i++;
} while ($i != 100);
With a string:
do {
$recordString = DB::get()->getRow('SELECT `name` from `username` ORDER BY RAND()')->getField('name');
} while ($recordString != 'admin');
//$recordString will now 100% be 'admin', unless you time out during the execution
Edit:
Given your updated question, I suggest changing your code to the following:
<?php
function parseLink($link, $outLinks = array()){
    global $conn; // $conn is defined outside the function, so it has to be imported with global
    //get title of page
    $page_title = get_url_title($link);
    //1-Get HTML content
    $html = file_get_contents($link);
    $document = new DOMDocument();
    @$document->loadHTML($html); // loadHTML() is an instance method; @ hides warnings from malformed HTML
    $newLinks = array(); // links seen for the first time on this page
    $i = 0; // counter for links stored from this page
    foreach ($document->getElementsByTagName('a') as $element) {
        $href = $element->getAttribute('href');
        if (!isset($outLinks[$href])) { //don't redo ones we've already done
            $linkArray['url'][$i] = $href;
            $linkArray['title'] = $page_title;
            $linkArray['num'] = $i;
            $siteTitle = $linkArray['title'];
            $numLinks = $linkArray['num'];
            $siteUrl = $linkArray['url'][$i];
            $queryI = $conn->query("INSERT INTO data (title, url, numLinks) VALUES ('$siteTitle','$siteUrl','$numLinks')");
            $outLinks[$href] = true; // mark this link as seen
            $newLinks[] = $href;
            $i++;
        }
    }
    // recurse only into links first seen on this page; looping over all of
    // $outLinks would revisit the same pages forever
    foreach ($newLinks as $href) {
        $outLinks = parseLink($href, $outLinks);
    }
    return $outLinks;
}
?>
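To look at the DOMDocument step in isolation, here is a minimal sketch that runs on an inline HTML string instead of a fetched page. The helper name extractHttpLinks and the sample HTML are assumptions made for illustration; the http filter mirrors the check in the question's code.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the
// http/https hrefs found in an HTML string, in document order, without duplicates.
function extractHttpLinks(string $html): array
{
    $document = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($document->getElementsByTagName('a') as $element) {
        $href = $element->getAttribute('href');
        // Keep only links that start with http (this also matches https).
        if (preg_match('/^http/i', $href) && !in_array($href, $links, true)) {
            $links[] = $href;
        }
    }
    return $links;
}

// Sample input: two absolute links, one relative link, and one duplicate.
$html = '<p><a href="https://example.com/a">A</a> <a href="/relative">R</a> '
      . '<a href="http://example.com/b">B</a> <a href="https://example.com/a">A again</a></p>';
print_r(extractHttpLinks($html));
```

Relative links such as /relative are dropped here because file_get_contents cannot fetch them without a base URL.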
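Changes:
Switched to DOMDocument, because regular expressions should not be used to parse HTML.
Changed $outLink to the array $outLinks, which the function now returns.
Answer 1 (score: 0)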
while($myvar) $myvar = test($myvar);
Answer 2 (score: 0)
Please try this:
$value = "Initial value";
function test($input){
// does stuff inside
// sets output to false if blank
return $output;
}
while($value) {
// runs until test returns false or an empty string
$value = test($value);
}
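Applied to the link-crawling case from the question, the same loop idea can be sketched with a queue and a visited list, so that pages linking back to each other do not loop forever. This is only a sketch under stated assumptions: crawl, $nextLinks, $maxPages and the in-memory $fakeWeb are hypothetical names, and the callable stands in for whatever actually fetches a page and returns its links.

```php
<?php
// Hypothetical sketch: $nextLinks is any callable that returns the list of
// links found on a page (for example, a wrapper around a link parser).
function crawl(string $start, callable $nextLinks, int $maxPages = 100): array
{
    $queue = array($start);   // pages still to visit
    $visited = array();       // url => true for every page already processed

    while ($queue && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue;         // skip pages we have already processed
        }
        $visited[$url] = true;
        foreach ($nextLinks($url) as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link;
            }
        }
    }
    return array_keys($visited);
}

// Stand-in for real fetching: a tiny in-memory "web" with a cycle (a -> b -> a).
$fakeWeb = array(
    'http://a.test' => array('http://b.test', 'http://c.test'),
    'http://b.test' => array('http://a.test'),
    'http://c.test' => array(),
);
$visitedPages = crawl('http://a.test', function ($url) use ($fakeWeb) {
    return $fakeWeb[$url] ?? array();
});
print_r($visitedPages);
```

The $maxPages cap plays the role of the stop condition in the while loop above: without it, a crawl over real pages may effectively never finish.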