Does Python (Spyder) skip lines of code in a function?

Time: 2019-07-19 12:49:24

Tags: python bioinformatics dna-sequence

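I am working on a project for a bioinformatics course. For the project we are given several DNA strings and an integer k. The task is to find a k-mer motif that minimizes the sum of the distances between the motif and each DNA string, where the distance to a string is the minimum Hamming distance between the motif and any k-mer in that string.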
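To make the objective concrete, here is a minimal sketch of the brute-force search, assuming the alphabet is ACGT and every string is at least k long. The names `hamming`, `d_text`, `d_dna`, `median_string` and the input `toy_dna` are illustrative only and do not come from the post.

```python
from itertools import product

def hamming(p, q):
    # Number of positions at which two equal-length strings differ.
    return sum(a != b for a, b in zip(p, q))

def d_text(pattern, text):
    # Minimum Hamming distance between pattern and any k-mer of text.
    k = len(pattern)
    return min(hamming(pattern, text[i:i + k]) for i in range(len(text) - k + 1))

def d_dna(pattern, dna):
    # Sum of the per-string minimum distances over the whole collection.
    return sum(d_text(pattern, text) for text in dna)

def median_string(dna, k):
    # Try every k-mer from AA...A to TT...T and keep one with the smallest total.
    kmers = ("".join(letters) for letters in product("ACGT", repeat=k))
    return min(kmers, key=lambda p: d_dna(p, dna))

toy_dna = ["ACGT", "ACGA", "TCGT"]   # hypothetical input, not the course data
print(median_string(toy_dna, 2))     # CG is the only 2-mer present in all three
```

When several k-mers tie, `min` returns the first one in lexicographic order, which is acceptable because the task allows any minimizer.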

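I wrote a function MedianString(seqLines,k) for this task. The function first assigns initial values to several variables and then runs a for loop. When I call the function, Python seems to skip all the lines before the loop and go straight to the loop body. I searched online for possible causes and found some similar discussions, but none of them seems to fit my case. I am quite lost...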
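One way to reason about whether the lines before the loop can really be skipped: `distance` is assigned inside the function, so Python treats it as a local variable, and reading it before any assignment raises `UnboundLocalError`. The sketch below uses a hypothetical stand-in, `median_without_init`, which is not from the post and simply omits the assignment before the loop.

```python
def median_without_init(values):
    # Hypothetical stand-in: the assignment to 'distance' before the loop
    # is deliberately left out, mimicking "skipped" initialization lines.
    for v in values:
        if distance > v:      # 'distance' is local because it is assigned below
            distance = v
    return distance

try:
    median_without_init([4, 2, 7])
    outcome = "returned normally"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)   # UnboundLocalError
```

The posted loop compares `distance > sumHD` without raising this error, which suggests that the initial assignment did execute. Why the early `print` output is not visible is not stated in the post; one possibility, offered only as an assumption, is that it scrolled out of the console because of the volume of output printed afterwards.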

Lines that Python skips:

print('BBBBBBBBBBBBBB')
distance = 100000000000000000000000000000000
print('distance=')
print(distance)
Median = []
pattern = []
sumHD = 0

Lines that run:

for i in range (0, 4**k-1):
    pattern = NumberToPattern(i,k)
    print('this is pattern')
    print (pattern)
    sumHD = sumOfMinHD(pattern,seqLines)
    print('sumHD=',type(sumHD))
    print(sumHD)
    print('distance=',type(distance))
    print(distance)
    if distance > sumHD:
        print('distance>sumOfMinHD')
        distance = sumOfMinHD(pattern,seqLines)
        Median = pattern
        print('distance=')
        print(distance)
    else:
        print('distance is <=sumOfMinHD')
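A side observation on the loop bound: `range(0, 4**k-1)` ends at index 4**k - 2, so the final k-mer (TT…T) is never tested. The sketch below, with an illustrative helper `all_kmers` that is not part of the post, enumerates every k-mer with `itertools.product` and compares the counts for k = 3.

```python
from itertools import product

def all_kmers(k):
    # Yield every k-mer over ACGT in lexicographic order, AA...A to TT...T.
    for letters in product("ACGT", repeat=k):
        yield "".join(letters)

k = 3
kmers = list(all_kmers(k))

print(len(kmers), kmers[0], kmers[-1])   # 64 AAA TTT
print(len(range(0, 4**k - 1)))           # 63: the posted bound stops one short
print(kmers[62])                         # TTG, the last index the posted loop reaches
```

Using `range(4**k)` with the index-to-pattern conversion, or iterating over the k-mers directly as above, would cover all 4**k candidates.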

Full code (pseudocode from the UC San Diego course "Finding Hidden Messages in DNA (Bioinformatics I)"):

#Code Challenge: Implement MedianString.
#Input: An integer k, followed by a collection of strings Dna.
#Output: A k-mer Pattern that minimizes d(Pattern, Dna) among all possible choices of k-mers. 
#(If there are multiple such strings Pattern, then you may return any one.)
 #the concept of the code:
 #MedianString(Dna, k)
 #   distance ← ∞
 #   for each k-mer Pattern from AA…AA to TT…TT
 #       if distance > d(Pattern, Dna)
 #            distance ← d(Pattern, Dna)
 #            Median ← Pattern
 #   return Median

with open(r'D:\Users\moonc\Desktop\python_exercises_for_bioimformatics_I\MedianStringSampleInput.txt','r') as seqFile :
    DataSet = seqFile.read().splitlines()
    print ('this is Dataset')
    print (DataSet)
    seqLines = DataSet [1:]
    print ('this is seqLines')
    print (seqLines)
    print (len(seqLines))
    k = int(DataSet[0])
    print ('this is k')
    print (k)

#NumberToPattern
def NumberToPattern(number, k):
    pattern = []
    for i in range (0,k):
        if number // (4**(k-1-i)) == 0:
            pattern.append ("A")
        elif number // (4**(k-1-i)) == 1:
            pattern.append ("C")
        elif number // (4**(k-1-i)) == 2:
            pattern.append ("G")
        elif number // (4**(k-1-i)) == 3:
            pattern.append ("T")

        # drop the digit just consumed before moving to the next position
        number = number % (4**(k-1-i))

    intToString = map(str, pattern)  
    patternString = "".join(intToString)

    return patternString

#Hamming Distance Problem: Compute the Hamming distance between two strings.
#   Input: Two strings of equal length.
#   Output: The Hamming distance between these strings.

def HammingDistance(p,q):
    HD = 0
    for i in range(0,len(p)):
        if p[i] != q [i]:
           HD = HD+1

    return HD


#minHDandMotif(Pattern, Text) is the minimum Hamming distance between Pattern and any k-mer in Text

def minHDandMotif(Pattern,String):
    HD = float('inf')
    motif = []
    for i in range (0,len(String)-len(Pattern)+1):
        if HammingDistance(Pattern,String[i:i+len(Pattern)]) < HD:
            HD = HammingDistance(Pattern,String[i:i+len(Pattern)])
            motif = String[i:i+len(Pattern)]
       # print (motif)
    print ('this is minHD and Motif')
    print ([HD,motif])
    return [HD,motif]


#sumOfMinHD(Pattern, Dna) as the sum of distances between Pattern and all strings in Dna
#Dna is a collection of strings of the same length

def sumOfMinHD(Pattern,seqLines):
    sumHD = 0
    print(seqLines)
    for i in range (0,len(seqLines)-1):  # note: this bound leaves out the last string in seqLines
        minHD = minHDandMotif(Pattern,seqLines[i])[0]
        sumHD = sumHD + minHD
    print ('this is sum Of MinHD')
    print (sumHD)
    return sumHD
######################
def MedianString(seqLines,k):
    print('BBBBBBBBBBBBBB')
    distance = 100000000000000000000000000000000
    print('distance=')
    print(distance)
    Median = []
    pattern = []
    sumHD = 0
    for i in range (0, 4**k-1):  # note: this bound stops at index 4**k-2, so the k-mer TT...T is never tested
        pattern = NumberToPattern(i,k)
        print('this is pattern')
        print (pattern)
        sumHD = sumOfMinHD(pattern,seqLines)
        print('sumHD=',type(sumHD))
        print(sumHD)
        print('distance=',type(distance))
        print(distance)
        if distance > sumHD:
            print('distance>sumOfMinHD')
            distance = sumOfMinHD(pattern,seqLines)
            Median = pattern
            print('distance=')
            print(distance)
        else:
            print('distance is <=sumOfMinHD')
    return Median
####################  
Median = MedianString(seqLines,k)
print ('this is MedianString')
print (Median)

Console output for the last iteration of the loop in MedianString(seqLines,k):

this is pattern
TTG
['AAATTGACGCAT', 'GACGACCACGTT', 'CGTCAGCGCCTG', 'GCTGAGCACCGG', 'AGTACGGGACAG']
this is minHD and Motif
[0, 'TTG']
this is minHD and Motif
[2, 'ACG']
this is minHD and Motif
[1, 'CTG']
this is minHD and Motif
[1, 'CTG']
this is sum Of MinHD
4
sumHD= <class 'int'>
4
distance= <class 'int'>
2
distance is <=sumOfMinHD
this is MedianString
ACC
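The console output above prints five strings but only four "this is minHD and Motif" results. That is consistent with the bound `range(0, len(seqLines)-1)` in sumOfMinHD, which stops one index short of the end of the list. A small sketch using the five strings from the output (the names `visited_as_posted` and `visited_all` are illustrative):

```python
seqLines = ['AAATTGACGCAT', 'GACGACCACGTT', 'CGTCAGCGCCTG',
            'GCTGAGCACCGG', 'AGTACGGGACAG']

# Strings reached by the loop bound used in the posted sumOfMinHD.
visited_as_posted = [seqLines[i] for i in range(0, len(seqLines) - 1)]

# Strings reached when iterating over the list directly.
visited_all = [seq for seq in seqLines]

print(len(visited_as_posted))              # 4
print(len(visited_all))                    # 5
print(seqLines[-1] in visited_as_posted)   # False: the last string is never scored
```

Iterating with `for seq in seqLines:` or `range(len(seqLines))` would include the last string in the sum.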

0 answers:

No answers yet.