我试图让我的Levenshtein编辑距离算法工作但由于某种原因,编辑的数量不正确。我无法看出我的错误在哪里,我想知道是否有人看到我做错了什么。
输入
id
预期产出
5
ATCGTT
AGTTAC
ACGAAT
CCGTAAAT
TTACGACCAGT
我的输出
Strand A: ATCGTT--
Strand B: A--GTTAC
Edit Distance: 4
Strand A: ATCG-TT
Strand B: A-CGAAT
Edit Distance: 3
Strand A: ATCGT---T
Strand B: -CCGTAAAT
Edit Distance: 5
Strand A: AT-CG----TT
Strand B: TTACGACCAGT
Edit Distance: 7
Strand A: AGTTAC
Strand B: ACGAAT
Edit Distance: 4
Strand A: -AGT-TAC
Strand B: CCGTAAAT
Edit Distance: 5
Strand A: --A-G-TTA-C
Strand B: TTACGACCAGT
Edit Distance: 8
Strand A: ACG--AAT
Strand B: CCGTAAAT
Edit Distance: 3
Strand A: --ACGA--A-T
Strand B: TTACGACCAGT
Edit Distance: 5
Strand A: --CCG-TAAAT
Strand B: TTACGACCAGT
Edit Distance: 7
findEditDistance
Strand A: ATCGT-
Strand B: AGTTAC
Edit Distance: 5
Strand A: ATC-T-
Strand B: ACGAAT
Edit Distance: 5
Strand A: ATC-T-
Strand B: CCGTAAAT
Edit Distance: 5
Strand A: A-C-T-
Strand B: TTACGACCAGT
Edit Distance: 10
Strand A: AGTTAC
Strand B: ACGAAT
Edit Distance: 5
Strand A: AG-TAC
Strand B: CCGTAAAT
Edit Distance: 6
Strand A: A--T-C
Strand B: TTACGACCAGT
Edit Distance: 7
Strand A: AC-AAT
Strand B: CCGTAAAT
Edit Distance: 7
Strand A: AC---T
Strand B: TTACGACCAGT
Edit Distance: 8
Strand A: CC-TAAAT
Strand B: TTACGACCAGT
Edit Distance: 8
getDistance的
void EditDistance::findEditDistance()
{
int upperValue, leftValue, diagonalValue;
for (int i = 0; i < mLengthX; ++i)
{
table[i][0].stringLength = i;
}
for (int i = 0; i < mLengthY; ++i)
{
table[0][i].stringLength = i;
}
for (int i = 1; i < mLengthX; ++i)
{
for (int j = 1; j < mLengthY; ++j)
{
if (mStringX[i] == mStringY[j])
{
table[i][j].direction = DIAGONAL;
table[i][j].stringLength = table[i - 1][j -1].stringLength;
}
else
{
upperValue = table[i - 1][j].stringLength;
leftValue = table[i][j - 1].stringLength;
diagonalValue = table[i - 1][j - 1].stringLength;
if (upperValue < leftValue)
{
if (upperValue < diagonalValue)
{
//upper is the lowest
table[i][j].stringLength = table[i - 1][j].stringLength + 1;
table[i][j].direction = UP;
}
else
{
//diagonal is lowest
table[i][j].stringLength = table[i - 1][j -1].stringLength + 1;
table[i][j].direction = DIAGONAL;
}
}
else if (leftValue < diagonalValue)
{
//left is lowest
table[i][j].stringLength = table[i][j - 1].stringLength + 1;
table[i][j].direction = LEFT;
}
else
{
//diagonal is lowest
table[i][j].stringLength = table[i - 1][j -1].stringLength + 1;
table[i][j].direction = DIAGONAL;
}
}
}
}
}
updateStrands
void EditDistance::getDistance()
{
int i = mStringX.length() - 1;
int j = mStringY.length() - 1;
numEdits = 0;
updateStrands (i, j);
}
答案 0 :(得分:1)
编辑距离的问题出在updateStrands
。它将对角线移动计为1,而实际上对角移动的距离可以是1(替换)或0(匹配)。您可以在updateStrands
中修复此问题,但当数字已经位于table
的右下角时,根本不需要在那里进行计算。
如果你想要正确的&#34; strands&#34; (例如&#34; ATCGTT - &#34;和#34; A - GTTAC&#34;),您必须在updateStrands
(您更改元素)中进行更正你应该插入)的字符串,getDistance
(你从错误的地方开始)和findEditDistance
(你忽略了在上面为direction
赋值将stringLength
设置为i
时的左边缘。