Using similar_text and strpos together

Time: 2016-10-27 17:32:30

Tags: php

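I want to create a simple search engine that looks for keywords in user input. I know I can use strpos to check whether a word exists in a string. However, I want to allow for users misspelling words. For example: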

$userInput = "What year did George Washingtin become president?";
$key_word = "Washington";
someFunction($userInput, $key_word, $percent);
if ($percent > .95) {
    $user_searched_washington = true;
}

Is there a PHP function that does this, or do you have any suggestions on how to write one?

2 answers:

Answer 0: (score: 3)

You could try the levenshtein function from the PHP standard library. See the documentation for some examples: http://php.net/manual/en/function.levenshtein.php

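However, this can become a very expensive computation as your list of possible keywords grows, because every input word has to be compared against every keyword.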

Edit: a minimal working example:

<?php

$myInput = 'persident';
$possibleKeywords = ['tyrant', 'president', 'king', 'royal'];
$scores = [];

foreach ($possibleKeywords as $keyword) {
    $scores[] = levenshtein($myInput, $keyword);
}

echo $possibleKeywords[array_search(min($scores), $scores)];
// prints: "president"
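The example above picks the closest keyword for a single misspelled word. For the question's case, where the input is a whole sentence, one possible approach is to split the input into words and accept the keyword when any word is within a small edit distance. This is only a sketch: the helper name containsFuzzy and the default threshold of 2 edits are assumptions made for illustration and are not part of the original answer.

```php
<?php

// Hypothetical helper: returns true if any word in $input is within
// $maxDistance single-character edits of $keyword (case-insensitive).
function containsFuzzy(string $input, string $keyword, int $maxDistance = 2): bool
{
    // Keep only letters, digits and spaces, then split on whitespace.
    $clean = preg_replace('/[^a-zA-Z0-9 ]+/', '', $input);
    $words = preg_split('/\s+/', strtolower($clean), -1, PREG_SPLIT_NO_EMPTY);
    $keyword = strtolower($keyword);

    foreach ($words as $word) {
        // levenshtein() counts insertions, deletions and substitutions.
        if (levenshtein($word, $keyword) <= $maxDistance) {
            return true;
        }
    }

    return false;
}

var_dump(containsFuzzy('What year did George Washingtin become president?', 'Washington'));
```

A fixed distance is simple but treats short and long keywords alike; for short keywords a threshold relative to the keyword length may work better.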

Answer 1: (score: 2)

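Here is what I came up with based on your title (using strpos and similar_text together), which should be enough to get you started. It allows searching for single words as well as phrases, and it ignores punctuation: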

function search($haystack, $needle) {
    // remove punctuation
    $haystack = preg_replace('/[^a-zA-Z 0-9]+/', '', $haystack);

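    // look for exact match; stripos() returns 0 for a match at the very
    // start of the string, so compare against false explicitly
    if (stripos($haystack, $needle) !== false) {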
        return true;
    }

    // look for similar match
    $words = explode(' ', $haystack);
    $total_words = count($words);
    $total_search_words = count(explode(' ', $needle));
    for ($i = 0; $i < $total_words; $i++) {
        // make sure the number of words we're searching for
        // doesn't exceed the number of words remaining
        if (($total_words - $i) < $total_search_words) {
            break;
        }

        // compare x-number of words at a time
        $temp = implode(' ', array_slice($words, $i, $total_search_words));
        $percent = 0;
        similar_text($needle, $temp, $percent);
        if ($percent >= 80) {
            return true;
        }
    }

    return false;
}

$text = "What year did George Washingtin become president?";
$keyword = "Washington";

if (search($text, $keyword)) {
    echo 'looks like a match!';
}
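To relate the 80 threshold used above to the cutoff in the question, it can help to look at the percentage similar_text actually reports. The sketch below returns the highest percentage between the keyword and any single word of the input. The helper name bestSimilarity is hypothetical, and lowercasing both strings is an added assumption, since similar_text itself is case-sensitive.

```php
<?php

// Hypothetical helper: highest similar_text() percentage (0 to 100) between
// $keyword and any single word of $input, ignoring punctuation and case.
function bestSimilarity(string $input, string $keyword): float
{
    // Keep only letters, digits and spaces, then split on whitespace.
    $clean = preg_replace('/[^a-zA-Z0-9 ]+/', '', $input);
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);
    $best = 0.0;

    foreach ($words as $word) {
        $percent = 0.0;
        // similar_text() writes the similarity percentage into $percent.
        similar_text(strtolower($keyword), strtolower($word), $percent);
        $best = max($best, $percent);
    }

    return $best;
}

echo bestSimilarity('What year did George Washingtin become president?', 'Washington');
```

Tracing similar_text by hand, "washington" and "washingtin" share 9 matching characters out of 10 each, which gives 90 percent. That clears the 80 threshold used above but would fail a 95 percent cutoff, so the threshold needs tuning to the amount of misspelling you want to tolerate.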