As discussed, we can find the closest word with levenshtein, for example:
<?php
$subj = "hello world";
$str = array();
$str[] = "hallo";
$str[] = "helo";
$minStr = "";
$minDis = PHP_INT_MAX;
foreach ($str as $curStr) {
    $dis = levenshtein($subj, $curStr);
    if ($dis < $minDis) {
        $minDis = $dis;
        $minStr = $curStr;
    }
}
echo($minStr);
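This outputs hallo, but what I want is to find the closest correct word for a misspelled one. For example, given hallo and helo, I want to find the matching correct word in $subj (hello), looked up from something like a dictionary, and return it as the output. hallo and helo are typed by the end user; hello is stored on the server as the correct word.
How can I do this?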
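Answer 0: (score: 0)
I think I understand your question.
Here I explode the subject into words and nest two loops, one over the subject words and one over the str words. The return value of levenshtein is stored in an array keyed first by the subject word, then by the "distance", with a subarray holding all the words at that distance from the subject word.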
$subj = explode(" ", "hello world");
$str = ["hallo", "helo", "aaahelojjjj", "pizza", "Manhattan"];
$dist = [];
foreach ($str as $curStr) {
    foreach ($subj as $word) {
        // number of single-character edits between the two words
        $dis = levenshtein($word, $curStr);
        // group as $dist[subject word][distance][] = candidate
        $dist[$word][$dis][] = $curStr;
    }
}
// optional: sort each subarray by its keys (the distances)
foreach ($dist as &$arr) {
    ksort($arr);
}
unset($arr); // break the reference left over from the loop
var_export($dist);
Output:
(unsorted, i.e. without the optional ksort step)
array (
  'hello' => // subject word
  array (
    1 => // key is the levenshtein() result (distance from the subject word)
    array ( // values are the words at that distance from the subject word
      0 => 'hallo', // both of these words are at distance 1 from 'hello'
      1 => 'helo',
    ),
    8 =>
    array ( // these words are at distance 8 from 'hello'
      0 => 'aaahelojjjj',
      1 => 'Manhattan',
    ),
    5 =>
    array (
      0 => 'pizza',
    ),
  ),
  'world' => // how far each word is from 'world'
  array (
    4 =>
    array (
      0 => 'hallo', // both hallo and helo are at distance 4 from 'world'
      1 => 'helo',
    ),
    10 =>
    array (
      0 => 'aaahelojjjj',
    ),
    5 =>
    array (
      0 => 'pizza',
    ),
    9 =>
    array (
      0 => 'Manhattan',
    ),
  ),
)
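If the goal is to go the other way, from each word the user typed to the nearest correct word stored on the server, the same distance comparison can be turned around. The following is only a minimal sketch under my own assumptions: closestDictionaryWords() is a hypothetical helper name, not something from the question, and the dictionary is assumed to be just the two words of "hello world". Note that levenshtein() compares bytes, so it is case-sensitive.

```php
<?php
// Hypothetical helper (not from the question): returns the dictionary
// word(s) with the smallest Levenshtein distance to $typed.
function closestDictionaryWords(string $typed, array $dictionary): array
{
    $best = [];
    $bestDistance = PHP_INT_MAX;
    foreach ($dictionary as $candidate) {
        $distance = levenshtein($typed, $candidate);
        if ($distance < $bestDistance) {
            // Strictly closer: start a new list of best matches.
            $bestDistance = $distance;
            $best = [$candidate];
        } elseif ($distance === $bestDistance) {
            // Tie: keep every word at the same distance.
            $best[] = $candidate;
        }
    }
    return ['distance' => $bestDistance, 'words' => $best];
}

// Assumed data: the correct words saved on the server and the user's input.
$dictionary = explode(" ", "hello world");
$typedWords = ["hallo", "helo"];

foreach ($typedWords as $typed) {
    $match = closestDictionaryWords($typed, $dictionary);
    echo $typed, " => ", implode(", ", $match['words']),
        " (distance ", $match['distance'], ")", PHP_EOL;
}
// prints:
// hallo => hello (distance 1)
// helo => hello (distance 1)
```

Ties are returned together rather than picking the first one, so the caller can decide how to break them. With a real dictionary you would probably also want to reject matches above some maximum distance, so that unrelated input does not get "corrected" to an arbitrary word.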