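I've been working on a search feature that lets me group specific words and then search them recursively as arrays, and I've coded myself into a hole. This is my first foray into recursion and I'm sure I've botched it. I'm close, but I can't seem to finish it correctly, and I'm hoping someone can point me in the right direction.
I'm working with bigrams. I start with an array of bigrams pulled from current events.
"least, 6", "6, texas", "powerball, jackpot", "killed, texas", "texas, tornado", "suspect, parade", and so on. There are 50 unique phrases.
What I'm trying to do is find the related words and group them. Using the data above, the grouped words would be "least, 6, texas, killed, tornado", because they are all related.
My approach is this: take the first bigram, split it, and search all the bigrams for either word; then take the bigrams where those two words were found, split them, and repeat once more. This should get me all of the matches (or enough to do what I need).
I've gotten to the point where I'm fairly sure I'm getting the right data, but I'm having trouble removing those words from the array so that they aren't searched over and over. Each search iteration should get smaller because there are fewer words left.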
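One way to picture the search described above is as a walk over a graph: each word is a node and each bigram is an edge between two words. The following is only a minimal sketch of that idea, not the code from this post. It assumes comma-separated bigram strings like the sample data (the database rows below appear to be space-separated, so the delimiter is an assumption), and `relatedWords` is a hypothetical name. A "visited" map takes the place of removing words from the array, so each word is searched once.

```php
<?php
// Hypothetical helper: collect every word reachable from $start through shared bigrams.
function relatedWords(string $start, array $bigrams): array
{
    // Build an adjacency map: word => [neighbour => true, ...].
    $adjacent = [];
    foreach ($bigrams as $bigram) {
        // Assumes every bigram has exactly two comma-separated words.
        [$a, $b] = array_map('trim', explode(',', strtolower($bigram)));
        $adjacent[$a][$b] = true;
        $adjacent[$b][$a] = true;
    }

    $start   = trim(strtolower($start));
    $visited = [$start => true]; // words that have already been searched
    $queue   = [$start];         // words still waiting to be searched

    while ($queue) {
        $word = array_shift($queue);
        foreach (array_keys($adjacent[$word] ?? []) as $next) {
            $next = (string) $next; // numeric keys such as "6" come back as integers
            if (!isset($visited[$next])) {
                $visited[$next] = true; // mark it so it is never queued again
                $queue[] = $next;
            }
        }
    }

    // Return the words as strings, in the order they were discovered.
    return array_map('strval', array_keys($visited));
}

$bigrams = ['least, 6', '6, texas', 'powerball, jackpot', 'killed, texas', 'texas, tornado', 'suspect, parade'];
print_r(relatedWords('least', $bigrams));
```

With the sample data, starting from "least" should yield least, 6, texas, killed, tornado, which is the group described above. The queue only ever receives words that have not been seen, so the work shrinks as the group fills in.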
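OK, the code. (Note that the recursive function isn't mine; that part actually works.) And yes, it's messy. Like I said, I'm just learning, and I plan to clean it up afterwards.
The array_diff near the bottom is supposed to remove the words that have already been searched and then repeat the process. Thanks in advance.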
<?php
$mainbigramarray = array();
$explodedarray = array();
$resarr = array();
$i = 1;

function recursive_array_search($needle, $haystack, $subloop = false) {
    if ($subloop === false) $resarr = array();
    foreach ($haystack as $key => $value) {
        $current_key = $key;
        if (is_string($needle)) $needle = trim(strtolower($needle));
        if (is_string($value)) $value = trim(strtolower($value));
        if ($needle === $value OR (is_array($value) && recursive_array_search($needle, $value, true) === true)) {
            $resarr[] = $current_key;
            if ($subloop === true) return true;
        }
    }
    return $resarr;
}

while ($rows = mysql_fetch_array($query))
{
    unset($bigrams);
    $bigram = $rows['bigram'];
    $count = $rows['m'];
    $rid = $rows['RID'];
    $bgexplode = explode(" ", $bigram);
    foreach ($bgexplode as $bg) {
        $bigrams[] = $bg;
    }
    $workingbigram[] = $bigrams;
    array_push($bigrams, $count);
    $mainbigramarray[] = $bigrams;
}

echo "<table>";
echo "<tr>";
$firstrun = array();
$secondrun = array();
$resultrun = array();
$matchset1 = array();
$bigramset = array();
echo "<td>";
foreach ($workingbigram as $bgg) { // steps through the main array that holds the exploded bigrams
    unset($firstrun);
    foreach ($bgg as $word) { // steps through both words of the bigram
        $search1 = recursive_array_search($word, $mainbigramarray);
        $firstrun[$word] = $search1;
    }
    $bigramset[] = $firstrun;
}
//echo "<pre>";
//print_r($bigramset);
//echo "</pre>";
echo "</td>";
echo "<td>";
$counter = 0;
foreach ($bigramset as $key1 => $value1) { // get the array that holds the exploded bigram
    foreach ($value1 as $key2 => $value2) { // get the array that holds the ids of where the word is found
        //echo "$key2<br>";
        foreach ($value2 as $searchid) { // get the id to pull the matching exploded bigrams from
            unset($bigresult);
            foreach ($workingbigram[$searchid] as $wordresult) { // get the word to search for, looked up by id in the working bigram array
                $bigresult = recursive_array_search($wordresult, $mainbigramarray);
            }
            $resultrun[] = $bigresult;
        }
    }
    foreach ($resultrun as $key3 => $value3) {
        foreach ($value3 as $finalsearchid) {
            foreach ($workingbigram[$finalsearchid] as $lastsearchterm) {
                $finalwordset[] = $lastsearchterm;
            }
        }
    }
    $finalwordset = array_unique($finalwordset);
    foreach ($finalwordset as $word) {
        $total = recursive_array_search($word, $mainbigramarray);
        $totalsum = 0;
        foreach ($total as $lastlookup) {
            unset($bucket);
            foreach ($mainbigramarray[$lastlookup] as $total6) {
                echo "$total6<br>";
                $bucket[] = $total6;
            }
            //echo "Score:" . $bucket[2] . "<br>";
            $totalsum = $bucket[2] + $totalsum;
            //echo "TOTAL SUM: $totalsum<br>";
        }
        echo "TOTAL SUM: $totalsum<br>";
    }
    $bigramset = array_diff($bigramset[$i], $finalwordset);
    //if ($counter ==1) break;
    $i++;
}
//echo"<pre>";
//print_r($newarray);
//echo "</pre>";
echo "</td>";
echo "</tr>";
echo "</table>";
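A note on the array_diff call near the end of the code above: as far as I can tell, `$bigramset[$i]` is keyed by word and holds arrays of ids as values, while `$finalwordset` is a flat list of words. `array_diff` compares values (by their string form), not keys, so it will not match the words in that shape. The following is a small sketch of the difference using made-up arrays (`$remaining`, `$searched` and `$idsByWord` are hypothetical names, not variables from the code above).

```php
<?php
$remaining = ['least', '6', 'texas', 'powerball', 'jackpot'];
$searched  = ['least', '6', 'texas'];

// Flat list of words: array_diff drops every value that also appears in $searched.
// The original keys are preserved, so array_values() is used to reindex from 0.
$left = array_values(array_diff($remaining, $searched));
// $left is ['powerball', 'jackpot']

// Map keyed by word: the words are keys here, so compare keys instead of values.
// array_flip() turns the word list into word => index so it can be matched by key.
$idsByWord  = ['least' => [0], 'texas' => [1, 3, 4], 'powerball' => [2]];
$leftByWord = array_diff_key($idsByWord, array_flip($searched));
// $leftByWord is ['powerball' => [2]]

print_r($left);
print_r($leftByWord);
```

Which of the two fits depends on whether the words being removed are stored as values or as keys at that point in the loop.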
Answer 0 (score: 0)
This returns an array whose keys are words and whose values are arrays of related words.
$bigrams = array("least, 6", "6, texas", "powerball,jackpot", "killed, texas", "texas, tornado", "suspect, parade");
$results = array();

// $val is the word being searched for, $i is the word the search started from.
function searchBigrams($val, $i, $bigrams) {
    global $results;
    if (!isset($results[$val]) || !is_array($results[$val])) { $results[$val] = array(); }
    foreach ($bigrams as $b) {
        $nodes = explode(',', $b);
        $nodes = array_map('trim', $nodes);
        if (($r = array_search($val, $nodes)) !== false) {
            // Drop the matched bigram so the recursive call cannot revisit it.
            $new_bigrams = $bigrams;
            if (($key = array_search($b, $new_bigrams)) !== false) {
                unset($new_bigrams[$key]);
            }
            if ($r == 0) {
                // $val is the first word, so the related word is the second one.
                if (!in_array($nodes[1], $results[$i])) { $results[$i][] = $nodes[1]; }
                if (!in_array($nodes[1], $results[$val])) { $results[$val][] = $nodes[1]; }
                if (!isset($results[$nodes[1]])) {
                    $results[$nodes[1]] = array();
                }
                if (!in_array($val, $results[$nodes[1]])) {
                    $results[$nodes[1]][] = $val;
                }
                searchBigrams($nodes[1], $i, $new_bigrams);
            } else {
                // $val is the second word, so the related word is the first one.
                if (!in_array($nodes[0], $results[$i])) { $results[$i][] = $nodes[0]; }
                if (!in_array($nodes[0], $results[$val])) { $results[$val][] = $nodes[0]; }
                if (!isset($results[$nodes[0]])) {
                    $results[$nodes[0]] = array();
                }
                if (!in_array($val, $results[$nodes[0]])) {
                    $results[$nodes[0]][] = $val;
                }
                searchBigrams($nodes[0], $i, $new_bigrams);
            }
        }
    }
}

// Start a search from every word of every bigram.
foreach ($bigrams as $b) {
    $nodes = explode(',', $b);
    $nodes = array_map('trim', $nodes);
    foreach ($nodes as $n) {
        searchBigrams($n, $n, $bigrams);
    }
}
var_dump($results);
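If what is needed is one list per group rather than one list per word, a union-find (disjoint-set) structure is another option. This is a different technique from the recursive function above, sketched here under the same assumption of comma-separated bigrams with exactly two words each. `findRoot` and `groupBigrams` are hypothetical names, and nothing here relies on a global variable.

```php
<?php
// Hypothetical helper: follow parent links until the representative of $word's group is found.
function findRoot(array &$parent, string $word): string
{
    while ($parent[$word] !== $word) {
        $parent[$word] = $parent[$parent[$word]]; // path halving keeps the chains short
        $word = $parent[$word];
    }
    return $word;
}

// Hypothetical helper: merge the two words of every bigram, then list the resulting groups.
function groupBigrams(array $bigrams): array
{
    $parent = []; // word => parent word; a root is its own parent
    foreach ($bigrams as $bigram) {
        [$a, $b] = array_map('trim', explode(',', strtolower($bigram)));
        foreach ([$a, $b] as $w) {
            if (!isset($parent[$w])) {
                $parent[$w] = $w; // a new word starts in a group of its own
            }
        }
        // Union: attach the root of $b's group to the root of $a's group.
        $rootA = findRoot($parent, $a);
        $rootB = findRoot($parent, $b);
        $parent[$rootB] = $rootA;
    }

    $groups = [];
    foreach (array_keys($parent) as $word) {
        $word = (string) $word; // numeric keys such as "6" come back as integers
        $groups[findRoot($parent, $word)][] = $word;
    }
    return array_values($groups);
}

$bigrams = ['least, 6', '6, texas', 'powerball,jackpot', 'killed, texas', 'texas, tornado', 'suspect, parade'];
print_r(groupBigrams($bigrams));
```

For the six sample bigrams this should produce three groups: least, 6, texas, killed, tornado; powerball, jackpot; and suspect, parade. Each bigram is processed once, so there is no need to copy and shrink the bigram array on every call.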