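I want to create a simple search engine that looks for keywords in user input. I know I can use strpos to check whether a word occurs in a string. However, I want users to be able to misspell words. For example: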
$userInput = "What year did George Washingtin become president?";
$key_word = "Washington";
someFunction($userInput, $key_word, $percent);
if ($percent > .95) {
    $user_searched_washington = true;
}
Is there a PHP function that does this, or do you have any suggestions on how to write one?
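Answer 0: (score: 3)
You could try the levenshtein function from PHP's standard library. The documentation has some examples: http://php.net/manual/en/function.levenshtein.php
However, this can become a very expensive computation as your list of possible keywords grows, because every keyword has to be compared against the input.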
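One way to reduce that cost, sketched here under assumptions: the edit distance between two strings is never smaller than the difference in their lengths, so keywords whose length differs too much from the input can be skipped without calling levenshtein at all. The helper name closestKeyword and the $maxDistance parameter are hypothetical and only serve as illustration.

```php
<?php
// Hypothetical helper: returns the closest keyword within $maxDistance edits, or null.
function closestKeyword(string $input, array $keywords, int $maxDistance): ?string
{
    $best = null;
    $bestDistance = $maxDistance + 1;
    foreach ($keywords as $keyword) {
        // The edit distance is at least the length difference,
        // so this keyword cannot beat the current best: skip it cheaply.
        if (abs(strlen($input) - strlen($keyword)) >= $bestDistance) {
            continue;
        }
        $distance = levenshtein($input, $keyword);
        if ($distance < $bestDistance) {
            $best = $keyword;
            $bestDistance = $distance;
        }
    }
    return $best;
}

$match = closestKeyword('persident', ['tyrant', 'president', 'king', 'royal'], 2);
```

With a maximum distance of 2, only 'president' is actually passed to levenshtein here; the other three keywords are rejected by the length check alone.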
Edit: a minimal working example:
<?php
$myInput = 'persident';
$possibleKeywords = ['tyrant', 'president', 'king', 'royal'];
$scores = [];
foreach ($possibleKeywords as $keyword) {
    // a lower distance means a closer match
    $scores[] = levenshtein($myInput, $keyword);
}
echo $possibleKeywords[array_search(min($scores), $scores)];
// prints: "president"
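To get something closer to the percentage the question asks for, the distance can be normalized by the length of the longer string and applied to each word of the input. This is only a rough sketch: levenshteinSimilarity and bestWordSimilarity are hypothetical helper names, and lowercasing plus splitting on non-alphanumeric characters are assumptions about how the input should be cleaned up.

```php
<?php
// Hypothetical helper: converts a Levenshtein distance into a similarity between 0 and 1.
function levenshteinSimilarity(string $a, string $b): float
{
    $maxLen = max(strlen($a), strlen($b));
    if ($maxLen === 0) {
        return 1.0; // two empty strings are identical
    }
    // Compare case-insensitively (an assumption, not required by levenshtein itself).
    return 1.0 - levenshtein(strtolower($a), strtolower($b)) / $maxLen;
}

// Hypothetical helper: best similarity between the keyword and any single word of the input.
function bestWordSimilarity(string $userInput, string $keyword): float
{
    // Split on anything that is not a letter or digit, dropping empty pieces.
    $words = preg_split('/[^a-zA-Z0-9]+/', $userInput, -1, PREG_SPLIT_NO_EMPTY);
    $best = 0.0;
    foreach ($words as $word) {
        $best = max($best, levenshteinSimilarity($word, $keyword));
    }
    return $best;
}

$userInput = "What year did George Washingtin become president?";
$score = bestWordSimilarity($userInput, "Washington");
```

"Washingtin" differs from "Washington" by a single substitution in ten characters, so the score is 0.9. A threshold of .95 as in the question would therefore reject this misspelling; the cutoff likely needs tuning against real input.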
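Answer 1: (score: 2)
Here is what I came up with based on your title (using both strpos and similar_text), which should be enough to get you started. It allows searching for single words as well as phrases, and it ignores punctuation: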
function search($haystack, $needle) {
    // remove punctuation
    $haystack = preg_replace('/[^a-zA-Z 0-9]+/', '', $haystack);
    // look for exact match; compare against false explicitly,
    // because a match at position 0 would otherwise be treated as "not found"
    if (stripos($haystack, $needle) !== false) {
        return true;
    }
    // look for similar match
    $words = explode(' ', $haystack);
    $total_words = count($words);
    $total_search_words = count(explode(' ', $needle));
    for ($i = 0; $i < $total_words; $i++) {
        // make sure the number of words we're searching for
        // doesn't exceed the number of words remaining
        if (($total_words - $i) < $total_search_words) {
            break;
        }
        // compare x-number of words at a time
        $temp = implode(' ', array_slice($words, $i, $total_search_words));
        $percent = 0;
        // similar_text writes the similarity (0 to 100) into $percent by reference
        similar_text($needle, $temp, $percent);
        if ($percent >= 80) {
            return true;
        }
    }
    return false;
}
$text = "What year did George Washingtin become president?";
$keyword = "Washington";
if (search($text, $keyword)) {
    echo 'looks like a match!';
}