Global sequence alignment function

Time: 2018-12-14 17:42:30

Tags: python dynamic-programming bioinformatics sequence-alignment

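I am trying to implement the Needleman-Wunsch algorithm to compute the minimum score of a global alignment, but when the two sequences are identical I get 8 instead of the expected minimum score of 0.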

What is wrong with this code?

alphabet = ["A", "C", "G", "T"] 
score = [[0, 4, 2, 4, 8], \
     [4, 0, 4, 2, 8], \
     [2, 4, 0, 4, 8], \
     [4, 2, 4, 0, 8], \
     [8, 8, 8, 8, 8]]

def globalAlignment(x, y):
#Dynamic version very fast
    D = []
    for i in range(len(x)+1):
        D.append([0]* (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1]+ score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]

            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

x = "ACGTGATGCTAGCAT"
y = "ACGTGATGCTAGCAT"
print(globalAlignment(x, y))

3 answers:

Answer 0: (score: 1)

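Make two changes. First, change

for i in range(len(y)+1):

to

for i in range(1, len(y) + 1):

Starting at 0 makes D[0][i-1] read D[0][-1], so D[0][0] is overwritten with a gap penalty of 8 instead of staying 0. Second, change

distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]

to

distVer = D[i - 1][j] + score[alphabet.index(x[i - 1])][-1]

so that the gap penalty for x is read from the last column, as in the initialization.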
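Putting the two changes together, a minimal sketch of the corrected function could look like the following. It assumes the same alphabet and score table as in the question, with the last row and column holding the gap penalty; only the loop bounds and the distVer index differ from the original.

```python
alphabet = ["A", "C", "G", "T"]
# Rows/columns 0-3 are substitution costs; the last row/column is the gap penalty.
score = [[0, 4, 2, 4, 8],
         [4, 0, 4, 2, 8],
         [2, 4, 0, 4, 8],
         [4, 2, 4, 0, 8],
         [8, 8, 8, 8, 8]]

def globalAlignment(x, y):
    # D[i][j] = minimum cost of aligning x[:i] with y[:j]
    D = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]

    # First column: x[:i] aligned against gaps only.
    for i in range(1, len(x) + 1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    # First row: start at 1 so that D[0][0] stays 0.
    for j in range(1, len(y) + 1):
        D[0][j] = D[0][j-1] + score[-1][alphabet.index(y[j-1])]

    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            distHor = D[i][j-1] + score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j] + score[alphabet.index(x[i-1])][-1]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]
            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

x = "ACGTGATGCTAGCAT"
y = "ACGTGATGCTAGCAT"
print(globalAlignment(x, y))  # 0
```

With this table the score matrix is symmetric, so the distVer change does not alter the result here, but it keeps the indexing consistent if the gap penalties for x and y ever differ.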

Answer 1: (score: 0)

At the very least,

distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]

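looks suspicious: the initialization does not put the [-1] index in the same position, and the two distances cannot both index the weights in the same orientation...
I think distVer should use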

score[alphabet.index(x[i-1])][-1]

But this may not be the only error...
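One way to look for further errors is to trace the first-row initialization on its own. The sketch below assumes the question's score table and uses hypothetical helper names (buggy_first_row, fixed_first_row) and a hypothetical two-character input. When the loop starts at 0, D[0][i-1] is D[0][-1], which is the last cell of the row and still 0, so D[0][0] receives a gap penalty.

```python
alphabet = ["A", "C", "G", "T"]
score = [[0, 4, 2, 4, 8],
         [4, 0, 4, 2, 8],
         [2, 4, 0, 4, 8],
         [4, 2, 4, 0, 8],
         [8, 8, 8, 8, 8]]

def buggy_first_row(y):
    # Same loop as in the question: i starts at 0, so row[i-1] wraps to row[-1].
    row = [0] * (len(y) + 1)
    for i in range(len(y) + 1):
        row[i] = row[i-1] + score[-1][alphabet.index(y[i-1])]
    return row

def fixed_first_row(y):
    # Starting at 1 leaves row[0] at 0, the cost of aligning two empty prefixes.
    row = [0] * (len(y) + 1)
    for i in range(1, len(y) + 1):
        row[i] = row[i-1] + score[-1][alphabet.index(y[i-1])]
    return row

print(buggy_first_row("AC"))  # [8, 16, 24]
print(fixed_first_row("AC"))  # [0, 8, 16]
```

The extra 8 in the first cell is carried along the diagonal, which would explain the result of 8 for two identical sequences.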

Answer 2: (score: 0)

I solved this by putting 0 instead of 8 in the last list of score:

alphabet = ["A", "C", "G", "T"] 
score = [[0, 4, 2, 4, 8], \
     [4, 0, 4, 2, 8], \
     [2, 4, 0, 4, 8], \
     [4, 2, 4, 0, 8], \
     [0, 0, 0, 0, 0]]

def globalAlignment(x, y):
    #Dynamic version very fast
    D = []
    for i in range(len(x)+1):
        D.append([0]* (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1]+ score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j]+ score[alphabet.index(x[i-1])][-1]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]

            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]
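One thing worth checking with this workaround is how it treats sequences of different lengths. The sketch below repeats the function with the zeroed last row and tries two hypothetical inputs. Because distHor now adds 0, extra characters in y appear to cost nothing, while extra characters in x still cost 8 each.

```python
alphabet = ["A", "C", "G", "T"]
score = [[0, 4, 2, 4, 8],
         [4, 0, 4, 2, 8],
         [2, 4, 0, 4, 8],
         [4, 2, 4, 0, 8],
         [0, 0, 0, 0, 0]]  # last row zeroed, as in the workaround

def globalAlignment(x, y):
    D = []
    for i in range(len(x)+1):
        D.append([0] * (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    # Still starts at 0, but the wrapped-around term now adds 0.
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1] + score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1] + score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j] + score[alphabet.index(x[i-1])][-1]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]
            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

print(globalAlignment("A", "ACGT"))  # 0: three unmatched characters in y are free
print(globalAlignment("ACGT", "A"))  # 24: three unmatched characters in x cost 8 each
```

If gaps on both sides are meant to be penalized equally, keeping the 8s and starting the first-row loop at 1 may be the safer fix.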