Question

As discussed, we can find the closest word with levenshtein, for example:

<?php
$subj = "hello world";
$str = array();
$str[] = "hallo";
$str[] = "helo";

$minStr = "";
$minDis = PHP_INT_MAX;
foreach ($str as $curStr) {
    $dis = levenshtein($subj, $curStr);
    if ($dis < $minDis) {
        $minDis = $dis;
        $minStr = $curStr;
    }
}
echo($minStr);

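The output is hallo, but I want to go the other way: starting from the incorrect words, find the closest correct word. For example, given hallo and helo, I want to find the correct word hello in $subj, as if it were looked up in a dictionary, and output it. hallo and helo are typed by the end user; hello is stored on the server as the correct word.

How can I do this?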

Answer 1

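I think I understand your question.

Here I explode the subject into words and nest the loops over the subject words and the $str words. The return value of levenshtein is stored in an array keyed first by subject word, then by "distance", whose value is a subarray of all the words at that distance from the subject word.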

$subj = explode(" ", "hello world"); // split the subject into words

$str = ["hallo", "helo", "aaahelojjjj", "pizza", "Manhattan"];

$dist = [];
foreach ($str as $curStr) {
    foreach ($subj as $word) {
        // group candidates by subject word, then by distance
        $dis = levenshtein($word, $curStr);
        $dist[$word][$dis][] = $curStr;
    }
}
// optional: sort each subarray by distance (its keys)
foreach ($dist as &$arr) {
    ksort($arr);
}
unset($arr); // break the reference left over from the loop
var_export($dist);

Output:

(unsorted, i.e. without the optional ksort step)
array (
  'hello' => // subject word
  array (
    1 =>     // key is the levenshtein output (distance from the subject word)
    array (  // values are the words at that distance from the subject word
      0 => 'hallo', // both of these words are at distance 1 from 'hello'
      1 => 'helo',
    ),
    8 =>
    array ( // these words are at distance 8 from 'hello'
      0 => 'aaahelojjjj',
      1 => 'Manhattan',
    ),
    5 => 
    array (
      0 => 'pizza',
    ),
  ),
  'world' =>  // here is how far each word is from 'world'
  array (
    4 =>
    array (
      0 => 'hallo', // both hallo and helo are at distance 4 from 'world'
      1 => 'helo',
    ),
    10 => 
    array (
      0 => 'aaahelojjjj',
    ),
    5 => 
    array (
      0 => 'pizza',
    ),
    9 => 
    array (
      0 => 'Manhattan',
    ),
  ),
)

https://3v4l.org/OVp7J
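
If the goal is only to map each word the user typed to the closest correct word, the same idea can be turned around: loop over the typed words and, for each one, keep the dictionary word with the smallest distance. The following is a minimal sketch, not part of the code above; closestWord() is a hypothetical helper name, and the "dictionary" is assumed to be simply the words of the subject string.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// returns the dictionary word with the smallest Levenshtein distance
// to $typed, or null if the dictionary is empty.
function closestWord(string $typed, array $dictionary): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($dictionary as $candidate) {
        $distance = levenshtein($typed, $candidate);
        if ($distance < $bestDistance) { // on a tie, the first candidate wins
            $bestDistance = $distance;
            $best = $candidate;
        }
    }
    return $best;
}

$dictionary = explode(" ", "hello world"); // correct words stored on the server
$typedWords = ["hallo", "helo"];           // words typed by the end user

foreach ($typedWords as $typed) {
    echo $typed, " => ", closestWord($typed, $dictionary), PHP_EOL;
}
```

With the distances shown in the output above (1 to 'hello' and 4 to 'world' for both typed words), each of hallo and helo maps to hello. For a real dictionary you would probably also want a maximum distance, so that unrelated input such as pizza is rejected instead of being matched to the least bad word.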
