对于我在java中的大学任务,我被要求提供额外的分析功能"我决定使用Levenshtein距离但我有一个问题,输出到控制台的数字比实际答案少一个。所以" cat"之间的距离和"帽子"应为1,但显示为0
public class Levenshtein {
public Levenshtein(String first, String second) {
char [] s = first.toCharArray();
char [] t = second .toCharArray();
int Subcost = 0;
int[][] array = new int[first.length()][second.length()];
for (int i = 0; i < array[0].length; i++)
{
array[0][i] = i;
}
for (int j = 0; j < array.length; j++)
{
array [j][0]= j;
}
for (int i = 1; i < second.length(); i++)
{
for (int j = 1; j < first.length(); j++)
{
if (s[j] == t [i])
{
Subcost = 0;
}
else
{
Subcost = 1;
}
array [j][i] = Math.min(array [j-1][i] +1,
Math.min(array [j][i-1] +1,
array [j-1][i-1] + Subcost) );
}
}
UI.output("The Levenshtein distance is -> " + array[first.length()-1][second.length()-1]);
}
}
答案 0 :(得分:0)
显然您正在使用以下算法:
https://en.wikipedia.org/wiki/Levenshtein_distance#Iterative_with_full_matrix
我认为你对指数不太准确。我不确定问题究竟在哪里,但这里是working version:
public int calculateLevenshteinDistance(String first, String second) {
char[] s = first.toCharArray();
char[] t = second.toCharArray();
int substitutionCost = 0;
int m = first.length();
int n = second.length();
int[][] array = new int[m + 1][n + 1];
for (int i = 1; i <= m; i++) {
array[i][0] = i;
}
for (int j = 1; j <= n; j++) {
array[0][j] = j;
}
for (int j = 1; j <= n; j++) {
for (int i = 1; i <= m; i++) {
if (s[i - 1] == t[j - 1]) {
substitutionCost = 0;
} else {
substitutionCost = 1;
}
int deletion = array[i - 1][j] + 1;
int insertion = array[i][j - 1] + 1;
int substitution = array[i - 1][j - 1] + substitutionCost;
int cost = Math.min(
deletion,
Math.min(
insertion,
substitution));
array[i][j] = cost;
}
}
return array[m][n];
}