Can't use levenshtein() on a single array to compare and find the element with the shortest distance

Asked: 2014-09-01 16:20:37

Tags: php arrays levenshtein-distance

I can't wrap my head around levenshtein().

Say I have an array that looks like this:

Array
(
    [0] => Array
        (
            [Gammal URL] => /bil-och-garage
            [Ny URL] => /catalog/verktyg-och-maskiner
        )

    [1] => Array
        (
            [Gammal URL] => /bil-och-garage/12-v-utrustning
            [Ny URL] => /catalog/verktyg-och-maskiner/handverktyg
        )

    [2] => Array
        (
            [Gammal URL] => /bil-och-garage/12-v-utrustning/antenn
            [Ny URL] => /catalog/verktyg-och-maskiner/handverktyg/slag-brytverktyg
        )
)

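What I want to do is print the array with one extra column. I want to go through the array and, for each 'Gammal URL', run levenshtein() against every 'Ny URL' and find the 'Ny URL' with the shortest distance to the current 'Gammal URL'.

If there is no exact match (distance 0), print the closest match. I've tried different uses of foreach, including nesting, but I can't figure out how to check one URL against all the others, one at a time.

In short, I want to print the whole array with a third column holding the URL with the shortest distance. If the above isn't the best solution, suggestions that use two arrays are welcome too.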
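The two-array variant mentioned above could be sketched roughly as follows. This is only an illustration under some assumptions: `closest_match()` is a hypothetical helper name, the sample rows are copied from the array shown earlier, and the type declarations assume PHP 7.1 or later. Note that before PHP 8.0, levenshtein() returns -1 when either string is longer than 255 characters, so this sketch assumes shorter URLs or PHP 8.

```php
<?php
// Hypothetical helper: returns the candidate with the smallest
// levenshtein() distance to $needle, or null if there are no candidates.
function closest_match(string $needle, array $candidates): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;

    foreach ($candidates as $candidate) {
        $distance = levenshtein($needle, $candidate);
        if ($distance < $bestDistance) {
            $bestDistance = $distance;
            $best = $candidate;
        }
        if ($distance === 0) {
            break; // exact match, nothing can beat it
        }
    }

    return $best;
}

// Sample rows copied from the array in the question.
$import = [
    ['Gammal URL' => '/bil-och-garage',
     'Ny URL'     => '/catalog/verktyg-och-maskiner'],
    ['Gammal URL' => '/bil-och-garage/12-v-utrustning',
     'Ny URL'     => '/catalog/verktyg-och-maskiner/handverktyg'],
    ['Gammal URL' => '/bil-och-garage/12-v-utrustning/antenn',
     'Ny URL'     => '/catalog/verktyg-och-maskiner/handverktyg/slag-brytverktyg'],
];

// Two flat arrays: one with the old URLs, one with the new ones.
$oldUrls = array_column($import, 'Gammal URL');
$newUrls = array_column($import, 'Ny URL');

// Add the third column to every row.
foreach ($oldUrls as $i => $oldUrl) {
    $import[$i]['Match URL'] = closest_match($oldUrl, $newUrls);
}

print_r($import);
```

Splitting the data into two flat lists keeps the comparison itself in one small function, so the nested loop no longer has to juggle both columns of the same array.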

Edit

With this, I'm still getting the wrong URL as the "matched URL". Any ideas?

foreach ($import as $key => $arr) {
  $shortest = '';
  foreach ($import as $key2 => $arr2) {
    if ($shortest != '') {
      // If the current Ny URL is closer than the previous shortest one,
      // it becomes the new shortest one; otherwise keep the previous one
      $shortest = (levenshtein($arr['Gammal URL'], $arr2['Ny URL']) < levenshtein($arr['Gammal URL'], $shortest)) ? $arr2['Ny URL'] : $shortest;
    } else { // The first candidate is the shortest by default
      $shortest = $arr2['Ny URL'];
    }
  }
  // Found the shortest one for this Gammal URL
  $import[$key]['shortest'] = $shortest;

  echo '<tr>';
  echo '<td>' . $arr['Gammal URL'] . '</td>';
  echo '<td>' . $arr['Ny URL'] . '</td>';
  echo '<td>' . $shortest . '</td>';
  echo '</tr>';
}

Full code

<?php

//debug

ini_set('display_errors', 'On');
error_reporting(E_ALL);
ini_set('auto_detect_line_endings', TRUE);
ini_set('max_execution_time', 300);
?>


<?php

//import

function csv_import($filename='', $delimiter=';')
{
  if(!file_exists($filename) || !is_readable($filename))
    return FALSE;

  $header = NULL;
  $data = array();
  if (($handle = fopen($filename, 'r')) !== FALSE)
  {
    while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
    {
      if(!$header)
        $header = $row;
      else
        $data[] = array_combine($header, $row);
    }
    fclose($handle);
  }
  return $data;
}

$import = csv_import('urler.csv');

//output

//print_r($import);

echo '<table>';
echo '<thead>';
echo '<tr>';
echo "<th>Gammal URL</th>";
echo "<th>Ny URL</th>";
echo "<th>Match URL</th>";
echo "</tr>";
echo "</thead>";
echo "<tbody>";

foreach ($import as $key => $arr) {
  $shortest = '';

  foreach ($import as $key2 => $arr2) {
    if ($shortest != '') {
      // If the current Ny URL is closer than the previous shortest one,
      // it becomes the new shortest one; otherwise keep the previous one
      $shortest = (levenshtein($arr['Gammal URL'], $arr2['Ny URL']) < levenshtein($arr['Gammal URL'], $shortest)) ? $arr2['Ny URL'] : $shortest;
    } else { // The first candidate is the shortest by default
      $shortest = $arr2['Ny URL'];
    }
  }
  // I found the shortest one for that Gammal URL
  $import[$key]['shortest'] = $shortest;


  echo '<tr>';
  echo '<td>' . $arr['Gammal URL'] . '</td>';
  echo '<td>' . $arr['Ny URL'] . '</td>';
  echo '<td>' . $shortest . '</td>';
  echo '</tr>';

}

echo "</tbody>";
echo "</table>";

?>

1 answer:

Answer 0: (score: 0)

This should do the job:

foreach ($array as $key => $arr) {
  $shortest = '';
  foreach ($array as $key2 => $arr2) {
    if ($shortest != '') {
      // If the current Ny URL is closer than the previous shortest one,
      // it becomes the new shortest one; otherwise keep the previous one
      $shortest = (levenshtein($arr['Gammal URL'], $arr2['Ny URL']) < levenshtein($arr['Gammal URL'], $shortest)) ? $arr2['Ny URL'] : $shortest;
    } else { // First attempt is set as the shortest by default
      $shortest = $arr2['Ny URL'];
    }
  }
  // I found the shortest one for that Gammal URL
  $array[$key]['shortest'] = $shortest;
}

I guess you do have to nest the foreach loops. Leave a comment if you need more explanation.
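A possible variant of the loop above keeps the best distance found so far instead of calling levenshtein() again on the previous winner, and stores that distance so an exact match (0) can be told apart from a nearest match. This is only a sketch: `add_shortest()` and the 'distance' key are hypothetical names that are not part of the original answer, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: adds a 'shortest' and a 'distance' key to every row.
// Each row is expected to have 'Gammal URL' and 'Ny URL' keys.
function add_shortest(array $rows): array
{
    foreach ($rows as $key => $row) {
        $shortest = null; // closest Ny URL found so far
        $best = null;     // its distance to the current Gammal URL

        foreach ($rows as $candidate) {
            $distance = levenshtein($row['Gammal URL'], $candidate['Ny URL']);
            if ($best === null || $distance < $best) {
                $best = $distance;
                $shortest = $candidate['Ny URL'];
            }
        }

        $rows[$key]['shortest'] = $shortest;
        $rows[$key]['distance'] = $best;
    }

    return $rows;
}

// Small made-up sample, only to show the shape of the result.
$array = add_shortest([
    ['Gammal URL' => '/a',   'Ny URL' => '/xyz'],
    ['Gammal URL' => '/xyz', 'Ny URL' => '/a'],
]);

foreach ($array as $row) {
    $label = $row['distance'] === 0 ? 'exact' : 'nearest';
    echo $row['Gammal URL'] . ' -> ' . $row['shortest'] . ' (' . $label . ")\n";
}
```

With the distance stored next to the match, printing the third table column and flagging inexact matches becomes a plain loop over the returned rows.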