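PHP: search an array with multiple keywords and sort the results

Time: 2017-09-22 20:30:52

Tags: php

I plan to search a txt file that I have prepared. The file's contents look like this: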

a.txt

Amy Jefferson
Nathalie Johnson
Emma West
Donna Jefferson
Tanya Nathalie
George West
Emma Watson
Emma Jefferson

If the code looks like this

a.php

$filename = "a.txt";
$example = file($filename, FILE_IGNORE_NEW_LINES);
$searchword = 'Emma Jefferson';
$matches = array();
foreach($example as $k=>$v) {
    if(preg_match("/\b$searchword\b/i", $v)) {
        $matches[$k] = $v;
        echo $matches[$k]."<br>";
    }
}

the result is only "Emma Jefferson".

Then, if I use this code

b.php

$filename = "a.txt";
$example = file($filename, FILE_IGNORE_NEW_LINES);
$searchword = 'Emma Jefferson';
$matches = array();
foreach($example as $k=>$v) {
    $searchword2 = str_ireplace(" ", "|", $searchword);
    if(preg_match("/\b$searchword2\b/i", $v)) {
        $matches[$k] = $v;
        echo $matches[$k]."<br>";
    }
}

the result looks like this

Amy Jefferson
Emma West
Donna Jefferson
Emma Watson
Emma Jefferson

The results are unique, but "Emma Jefferson" comes last.

So the question is: how can I search for Emma Jefferson and get the results sorted like this?

Emma Jefferson
Emma Watson
Emma West
Amy Jefferson
Donna Jefferson

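So basically it should first match the whole phrase "Emma Jefferson", then "Emma", and finally "Jefferson".

Update: I accepted Don't Panic's code for this problem, but I want to thank all the contributors: Don't Panic, RomanPerekhrest, Sui Dream, Jere, i-man. You are all the best!

Pattygeek

5 answers:

Answer 0: (score: 1)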

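I'm not sure how to take the match position into account with a regex solution, but it can be done if you convert the search string and each item into arrays of words.

With this approach, we iterate over the text items and, for each item, build an array holding the position (within the search term) of every search word it contains. We then sort the results by number of matches first, and by match position second.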

$search_words = explode(' ', strtolower($searchword));
$result = [];   // stays an empty array if nothing matches

foreach ($example as $item) {
    $item_words = explode(' ', strtolower($item));

    // look for each word in the search term
    foreach ($search_words as $i => $word) {
        if (in_array($word, $item_words)) {

            // add the index of the word in the search term to the result
            // this way, words appearing earlier in the search term get higher priority
            $result[$item][] = $i;
        }
    }
}

// pre-sort alphabetically; this order is kept when the uasort callback returns 0 (equal)
// note: that relies on a stable sort, which PHP only guarantees from 8.0 onwards
ksort($result);

// sort by number of matches, then position of matches    
uasort($result, function($a, $b) {
    return count($b) - count($a) ?: $a <=> $b;
});

// convert keys to values    
$result = array_keys($result);
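
For reference, here is a minimal sketch of the same word-array ranking wrapped in a function. The name rankByWords is hypothetical, and the comparator is a variation rather than the answer's exact code: it folds the alphabetical tie-break into the callback, so the outcome does not depend on the sort being stable.

```php
<?php
// Hypothetical helper (not part of the original answer): ranks $items by how
// many words of $searchword they contain, then by which words, then by name.
function rankByWords(array $items, string $searchword): array
{
    $search_words = explode(' ', strtolower($searchword));
    $result = [];

    foreach ($items as $item) {
        $item_words = explode(' ', strtolower($item));
        foreach ($search_words as $i => $word) {
            if (in_array($word, $item_words, true)) {
                // record the position of the matched word in the search term
                $result[$item][] = $i;
            }
        }
    }

    uksort($result, function ($a, $b) use ($result) {
        // 1. more matched words first
        $byCount = count($result[$b]) <=> count($result[$a]);
        if ($byCount !== 0) {
            return $byCount;
        }
        // 2. words appearing earlier in the search term first
        $byPosition = $result[$a] <=> $result[$b];
        if ($byPosition !== 0) {
            return $byPosition;
        }
        // 3. alphabetical as the final tie-break
        return strcmp($a, $b);
    });

    return array_keys($result);
}
```

Called as rankByWords($example, 'Emma Jefferson') on the lines of a.txt, this should give the order asked for in the question.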

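Answer 1: (score: 0)

You currently echo each result immediately, so the results come out in the order they appear in the file.

You can search for the full string and for partial matches, and then concatenate the results: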

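// $example and $searchword are defined as in the question
$searchword2 = str_replace(' ', '|', $searchword);
$fullMatches = $matches = array();
foreach($example as $k=>$v) {
    if(preg_match("/\b$searchword\b/i", $v)) {
        $fullMatches[] = $v;
    }
    if(preg_match("/\b$searchword2\b/i", $v)) {
        $matches[] = $v;
    }
}
// full matches go first; array_unique keeps the first occurrence of each line
$matches = array_unique(array_merge($fullMatches, $matches));
foreach($matches as $k => $v)
    echo $v . "<br>";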

Update

A variant for multiple words:

$words = ['Emma', 'Jefferson'];
$matches = array();
foreach($example as $k => $v) {
    $fullStr = implode(' ', $words);
    if(preg_match("/\b$fullStr\b/i", $v))
        $matches[0][] = $v;
    $str = "";
    $i = 1;
    foreach($words as $word) {
        if ($str === "")
            $str = $word;
        else
            $str .= '|' . $word;
        if(preg_match("/\b$str\b/i", $v))
            $matches[$i][] = $v;
        $i++;
    }
}
$result = array();
foreach($matches as $firstKey => $arr) {
    foreach($arr as $secondKey => $v) {
        $result[] = $v;
    }
}
$result = array_unique($result);
foreach($result as $k => $v)
    echo $v . "<br>";
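
One caveat with building the pattern by string replacement: in /\bEmma|Jefferson\b/ each \b belongs to only one alternative, and any regex metacharacters in the search words are not escaped. The sketch below shows one way to handle both; the helper name anyWordPattern is made up for illustration.

```php
<?php
// Hypothetical helper: builds a whole-word, case-insensitive pattern that
// matches any one of the words in $searchword.
function anyWordPattern(string $searchword): string
{
    $words = preg_split('/\s+/', trim($searchword));
    // preg_quote() escapes regex metacharacters and the "/" delimiter
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    // the (?:...) group makes both \b anchors apply to every alternative
    return '/\b(?:' . implode('|', $quoted) . ')\b/i';
}

$pattern = anyWordPattern('Emma Jefferson');
var_dump(preg_match($pattern, 'Emmanuel Stone'));                 // int(0)
var_dump(preg_match('/\bEmma|Jefferson\b/i', 'Emmanuel Stone'));  // int(1)
```

The second call shows the ungrouped pattern accepting "Emmanuel", because \bEmma on its own has no closing word boundary.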

Answer 2: (score: 0)

A complex solution:

$lines = file('a.txt', FILE_IGNORE_NEW_LINES);
$name = 'Emma';
$surname = 'Jefferson';
$emmas = $jeffersons = [];

foreach ($lines as $l) {
    if (strpos($l, $name) === 0) {
        $emmas[] = $l;
    } elseif ( strrpos($l, $surname) === (strlen($l) - strlen($surname)) ) {
        $jeffersons[] = $l;
    }
}

usort($emmas, function($a,$b){
    return strcmp(explode(' ', $a)[1], explode(' ', $b)[1]);
});
usort($jeffersons, function($a,$b){
    return strcmp($a, $b);
});

$result = array_merge($emmas, $jeffersons);
print_r($result);

Output:

Array
(
    [0] => Emma Jefferson
    [1] => Emma Watson
    [2] => Emma West
    [3] => Amy Jefferson
    [4] => Donna Jefferson
)

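Answer 3: (score: 0)

You have to write a new loop or sort your array afterwards, because the foreach loop only looks at one name at a time: it tests whether the name matches your search word and, if it does, puts the name into your new array $matches[]. So the

    if(preg_match("/\b$searchword2\b/i", $v)) {
        $matches[$k] = $v;
        echo $matches[$k]."<br>";
    }

part knows nothing about which names are already inside $matches[] and which are not.

So my suggestion is: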

$filename = "a.txt";
$example = file($filename, FILE_IGNORE_NEW_LINES);
$searchword = 'Emma Jefferson';
$matches = array();



$searchword2 = array($searchword, explode(" ", $searchword)[0], explode(" ", $searchword)[1]);
$isThisNameAlreadyInTheList = false;

foreach($searchword2 as $actualSearchword) {

    foreach($example as $k=>$v) {

        $isThisNameAlreadyInTheList = false;
        foreach($matches as $match) {   
            if(preg_match("/\b$match\b/i", $v)) {
                $isThisNameAlreadyInTheList = true;
            }
        }

        if (!$isThisNameAlreadyInTheList) {
            if(preg_match("/\b$actualSearchword\b/i", $v)) {
                $matches[$k] = $v;
                echo $matches[$k]."<br>";
            }
        }
    }

}
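
The same idea can probably be written more compactly by keying the matches on the line index, so that the "already in the list" check becomes a single isset(). This is only a sketch under that assumption; searchByPriority is a hypothetical name, and lines within each priority group stay in file order rather than being sorted alphabetically.

```php
<?php
// Hypothetical helper: returns the matching lines grouped by search-term
// priority (full phrase first, then each word in the order given).
function searchByPriority(array $lines, string $searchword): array
{
    $terms = array_merge([$searchword], explode(' ', $searchword));
    $matches = [];

    foreach ($terms as $term) {
        $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
        foreach ($lines as $k => $line) {
            // the line index is the key, so each line is recorded only once
            if (!isset($matches[$k]) && preg_match($pattern, $line)) {
                $matches[$k] = $line;
            }
        }
    }

    // PHP arrays keep insertion order, so this is already in priority order
    return array_values($matches);
}
```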

Answer 4: (score: 0)

I would use a preg_match_all solution like this:

$searchName = "Emma Jefferson";
$searchTerms = explode(' ', $searchName);

$pattern = "/(\b$searchTerms[0]\b \b$searchTerms[1]\b)|(\b$searchTerms[0]\b \w+)|(\w* \b$searchTerms[1]\b)/i";

$output = [];
preg_match_all($pattern, implode(' | ', $example), $out);

foreach($out as $k => $o){
    if($k == 0){
        continue;
    }

    foreach($o as $item){
        if(!empty($item)){
            $output[] = $item;
        }
    }
}

print_r($output);

You could also read the file in as a single string and skip the implode part.
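
A rough sketch of that variant follows, assuming the file holds one name per line and the search name consists of exactly two words. The function name matchInString is hypothetical; in practice $contents would come from something like file_get_contents('a.txt').

```php
<?php
// Hypothetical helper: $contents is the whole file as one string.
// Assumes $searchName is exactly two words, e.g. "Emma Jefferson".
function matchInString(string $contents, string $searchName): array
{
    list($first, $last) = array_map(function ($t) {
        return preg_quote($t, '/');
    }, explode(' ', $searchName, 2));

    // group 1: full name, group 2: first name only, group 3: last name only;
    // the literal space never matches a newline, so matches stay on one line
    $pattern = '/(\b' . $first . ' ' . $last . '\b)'
             . '|(\b' . $first . ' \w+)'
             . '|(\w* ' . $last . '\b)/i';

    preg_match_all($pattern, $contents, $out);

    $output = [];
    // $out[0] holds the whole matches; $out[1] to $out[3] hold one capture
    // group each, with empty strings where that group did not take part
    foreach (array_slice($out, 1) as $group) {
        foreach ($group as $item) {
            if ($item !== '') {
                $output[] = $item;
            }
        }
    }
    return $output;
}
```

As in the original snippet, the names within each group come out in file order, so an extra sort per group would be needed for the alphabetical order shown in the question.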