Approximate pattern matching?

Time: 2016-10-12 13:17:50

Tags: python

I am trying to write code for approximate pattern matching, as shown below:

def HammingDistance(p, q):
    d = 0
    for p, q in zip(p, q): # your code here
        if p!= q:
            d += 1
    return d
Pattern = "ATTCTGGA"
Text = "CGCCCGAATCCAGAACGCATTCCCATATTTCGGGACCACTGGCCTCCACGGTACGGACGTCAATCAAAT"
d = 3
def ApproximatePatternMatching(Pattern, Text, d):
    positions = [] # initializing list of positions
    for i in range(len(Text) - len(Pattern)+1):
        if Pattern == Text[i:i+len(Pattern)]:
            positions.append(i)# your code here
    return positions
print (ApproximatePatternMatching(Pattern, Text, d))

I keep getting the following error: Test #3 failed. You may be failing to account for patterns starting at the first index of Text.

Test dataset:

GAGCGCTGG
GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT
2

Your output:

['[]', '0']

Correct output:

['0', '30', '66']

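I can't figure out what I'm doing wrong. I'm only just starting to learn Python and have no programming background. Any help would be appreciated.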

2 answers:

Answer 0: (score: 1)

I'm not sure why you are getting an empty list as one of your outputs; when I run your code, the only thing printed is [0].

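The underlying problem is that your code currently checks only for exact substring matches and never calls the HammingDistance function you defined.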
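To make the difference concrete, here is a minimal sketch using the Pattern and Text from the test dataset in the question. The helper name hamming_distance is my own, and the positions 0, 30 and 66 are simply the ones listed in the expected output.

```python
def hamming_distance(p, q):
    """Count the positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(p, q))

pattern = "GAGCGCTGG"
text = "GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT"

for i in (0, 30, 66):
    window = text[i:i + len(pattern)]
    # Columns: position, window, result of the exact comparison, Hamming distance
    print(i, window, window == pattern, hamming_distance(pattern, window))
```

Only the window at position 0 passes the exact comparison; the windows at 30 and 66 differ from the pattern in 1 and 2 positions respectively, so they only show up once the distance is compared against d.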

The following should return the result you expect:

Pattern = "GAGCGCTGG"
Text = "GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT"
d = 2  # maximum number of mismatches, as given in the test dataset

def HammingDistance(p, q):
    # count the positions at which p and q differ
    count = 0
    for a, b in zip(p, q):
        if a != b:
            count += 1
    return count

def ApproximatePatternMatching(Pattern, Text, d):
    positions = [] # initializing list of positions
    for i in range(len(Text) - len(Pattern)+1):
        # accept any window within d mismatches (distance <= d), rather than exact matching
        if HammingDistance(Pattern, Text[i:i+len(Pattern)]) <= d:
            positions.append(i)
    return positions

print(ApproximatePatternMatching(Pattern, Text, d))  # [0, 30, 66]
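One thing to be aware of: zip stops at the shorter of its two inputs, so a Hamming distance built on it silently ignores any trailing characters when the strings differ in length. If you would rather have that case fail loudly, a defensive variant could look like the sketch below. The name hamming_distance_strict is hypothetical and not part of the original exercise.

```python
def hamming_distance_strict(p, q):
    """Hamming distance that refuses inputs of different lengths."""
    if len(p) != len(q):
        raise ValueError("inputs must have the same length")
    return sum(a != b for a, b in zip(p, q))

print(hamming_distance_strict("GAGCGCTGG", "GAGCGCTGT"))  # 1

try:
    # The second string is one character too long, e.g. from an off-by-one slice
    hamming_distance_strict("GAGCGCTGG", "GAGCGCTGGG")
except ValueError as err:
    print("rejected:", err)
```

This is mainly useful while debugging, since a window sliced one character too long would otherwise go unnoticed.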

Answer 1: (score: 0)

def ApproximatePatternMatching(Pattern, Text, d):
    positions = []
    for i in range(len(Text)-len(Pattern)+1):
        # the window must be exactly len(Pattern) characters long
        x = Text[i:i+len(Pattern)]
        # exact matches have distance 0, so they pass the same test
        y = HammingDistance(Pattern, x)
        if y <= d:
            positions.append(i)
    return positions


def HammingDistance(p, q):
    # count the positions at which p and q differ
    count = 0
    for i in range(len(p)):
        if p[i] != q[i]:
            count = count + 1
    return count
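Since the grader's message in the question mentions matches that start at the first index of Text, it is worth checking both ends of Text on a small input you can verify by hand. The following is a minimal sketch of such a check; the snake_case function names and the toy string "AAAACCCC" are my own choices, not part of the exercise.

```python
def hamming_distance(p, q):
    """Count the positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(p, q))

def approximate_pattern_matching(pattern, text, d):
    """Return every start index whose window is within d mismatches of pattern."""
    last_start = len(text) - len(pattern)
    return [i for i in range(last_start + 1)
            if hamming_distance(pattern, text[i:i + len(pattern)]) <= d]

text = "AAAACCCC"
print(approximate_pattern_matching("AAAA", text, 0))  # [0]    match at the first index
print(approximate_pattern_matching("CCCC", text, 0))  # [4]    match at the last valid index
print(approximate_pattern_matching("CCCC", text, 1))  # [3, 4] one mismatch allowed
```

If the pattern is longer than the text, the range is empty and the function returns an empty list.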