Trying to implement a simple edit distance module

Time: 2016-09-09 09:40:51

Tags: python

Here is the code for the function:

def populateConfusionMatrix(word,errword):
    dp = [[0]*(len(errword)+1) for i in range(len(word)+1)]
    m = len(word)+1;
    n = len(errword)+1;
    for i in range(m):
        for j in range(n):
            dp[i][0] = i;
            dp[0][j] = j;
    for i in range(m):
        for j in range(n): 
            print(i,j)
            if i==0 or j==0 :
                continue

            dis = [0]*4
            dis[0] = dp[i-1][j]+1
            dis[1] = dp[i][j-1]+1
            print("dis[1] is ",dp[i][j-1]+1)
            if word[i-1] == errword[j-1]:
                dis[2] = dp[i-1][j-1]
            else :
                dis[2] = dp[i-1][j-1]+1

            if i>1 and j>1 and word[i] == errword[j-1] and word[i-1] == errword[j]:
                dis[3] = dp[i-2][j-2] + 1 
            if dis[3]!=0 :
                dp[i][j] = min(dp[0:4])
            else :
                dp[i][j] = min(dp[0:3])

    i = m
    j = n
    while(i>=0 and j>=0) :
        if word[i-1] == errword[j-1] :
            i=i-1
            j=j-1
            continue
        if dp[i][j] == dp[i][j-1]+1 :
            populate_ins(word[i],errword[j])
            j=j-1
        if dp[i][j] == dp[i-1][j]+1 :
            populate_del(errword[j],word[i])
            i=i-1
        if dp[i][j] == dp[i-1][j-1] + 1 :
            populate_sub(word[i],errword[j])
            i=i-1
            j=j-1
        if i>1 and j>1 and word[i] == errword[j-1] and word[i-1] == errword[j] and dp[i][j] == dp[i-2][j-2]+1 :
            populate_exc(word[i-1],word[i])
            i=i-1
            j=j-1

But calling the function raises this error:

populateConfusionMatrix("actress","acress")

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-d5ba10f95b61> in <module>()
----> 1 populateConfusionMatrix("actress","acress")

<ipython-input-24-e996de70e204> in populateConfusionMatrix(word, errword)
     15             dis = [0]*4
     16             dis[0] = dp[i-1][j]+1
---> 17             dis[1] = dp[i][j-1]+1
     18             print("dis[1] is ",dp[i][j-1]+1)
     19             if word[i-1] == errword[j-1]:

TypeError: can only concatenate list (not "int") to list

I printed the (i, j) values to see how far the loop runs before it fails, and got this:

(0, 0)
(0, 1)
(0, 2)
(0, 3)
(0, 4)
(0, 5)
(0, 6)
(1, 0)
(1, 1)
('dis[1] is ', 2)
(1, 2)

1 answer:

Answer 0 (score: 0)

Your code is really hard to follow, but the problem is definitely in these lines:

if dis[3]!=0 :
    dp[i][j] = min(dp[0:4])
else :
    dp[i][j] = min(dp[0:3])

At that point your list dp holds the following value:

[[0, 1, 2, 3, 4, 5, 6], [1, 0, 0, 0, 0, 0, 0], [2, 0, 0, 0, 0, 0, 0], [3, 0, 0, 0, 0, 0, 0], [4, 0, 0, 0, 0, 0, 0], [5, 0, 0, 0, 0, 0, 0], [6, 0, 0, 0, 0, 0, 0], [7, 0, 0, 0, 0, 0, 0]]

When you write dp[i][j] = min(dp[0:3]), you are calling min on a slice of dp, which amounts to:

min([0, 1, 2, 3, 4, 5, 6], [1, 0, 0, 0, 0, 0, 0], [2, 0, 0, 0, 0, 0, 0])
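To make this concrete, here is a minimal sketch that reproduces the behaviour on a reduced table. The names `cell` and `error_message` are only for illustration and do not appear in the question. Python compares lists lexicographically, so `min` over a slice of rows returns a whole row rather than a number.

```python
# A reduced version of the table from the question (3 rows of 4 columns).
dp = [[0, 1, 2, 3], [1, 0, 0, 0], [2, 0, 0, 0]]

# min() over a slice of rows compares the rows lexicographically
# and returns one entire row, not an integer.
cell = min(dp[0:3])
print(type(cell).__name__, cell)

# Adding an int to that row raises the TypeError seen in the traceback.
try:
    cell + 1
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```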

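min returns a whole row, and that row is stored in dp[i][j]. That is why, on the next iteration, you get an error when you try to add a number to a list: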

dis[1] = dp[i][j-1] + 1  # dp[i][j-1] is now a list, so this is something like [0, 1, 2, 3, 4, 5, 6] + 1
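Presumably the intent was to take the minimum over the candidate distances in dis, not over rows of dp. Below is a rough sketch of the table-filling step under that assumption. The function name `edit_distance_table` and the `candidates` list are hypothetical and not from the question. The transposition check here uses indices i-1 and i-2, because word[i] and errword[j] in the original would read past the end of the strings on the last row and column. The backtracking part of the question is left out.

```python
def edit_distance_table(word, errword):
    """Fill a DP table for edit distance with adjacent transpositions."""
    m, n = len(word), len(errword)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters of word
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters of errword

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if word[i - 1] == errword[j - 1] else 1
            candidates = [
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            ]
            # Transposition of two adjacent characters, using 0-based
            # string indices i-1 and i-2 so nothing is read out of range.
            if (i > 1 and j > 1
                    and word[i - 1] == errword[j - 2]
                    and word[i - 2] == errword[j - 1]):
                candidates.append(dp[i - 2][j - 2] + 1)
            dp[i][j] = min(candidates)  # min over the candidates, not over dp
    return dp


table = edit_distance_table("actress", "acress")
print(table[-1][-1])  # 1: deleting the "t" turns "actress" into "acress"
```

Building the candidate list only from the moves that are valid at each cell also removes the need for the dis[3] != 0 check in the original.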