Python: skew of a genome

Time: 2016-09-19 09:20:57

Tags: python dictionary skew

I'm trying to write a function that computes the skew of a genome, but I keep getting this error:

Failed test #2.
Test Dataset: AGCGTGCCGAAATATGCCGCCAGACCTGCTGCGGTGGCCTCGCCGACTTCACGGATGCCAAGTGCATAGAGGAAGCGAGCAAAGGTGGTTTCTTTCGCTTTATCCAGCGCGTTAACCACGTTCTGTGCCGACTTT
Your output: ['0', '0']
Correct output: ['0', '0', '1', '0', '1', '1', '2', '1', '0', '1', '1', '1', '1', '1', '1', '1', '2', '1', '0', '1', '0', '-1', '-1', '0', '0', '-1', '-2', '-2', '-1', '-2', '-2', '-1', '-2', '-1', '0', '0', '1', '2', '1', '0', '0', '-1', '0', '-1', '-2', '-1', '-1', '-2', '-2', '-2', '-3', '-3', '-4', '-3', '-2', '-2', '-2', '-1', '-2', '-3', '-3', '-3', '-2', '-2', '-1', '-2', '-2', '-2', '-2', '-1', '-1', '0', '1', '1', '1', '2', '1', '2', '2', '3', '2', '2', '2', '2', '3', '4', '4', '5', '6', '6', '6', '6', '5', '5', '5', '5', '4', '5', '4', '4', '4', '4', '4', '4', '3', '2', '2', '3', '2', '3', '2', '3', '3', '3', '3', '3', '2', '1', '1', '0', '1', '1', '1', '0', '0', '1', '1', '2', '1', '0', '1', '1', '0', '0', '0', '0'] 

My code:

Genome = "CATGGGCATCGGCCATACGCC"
def SymbolArray(Genome, symbol):
    array = {}
    n = len(Genome)
    ExtendedGenome = Genome + Genome[0:n//2]
    for i in range(n):
        array[i] = PatternCount(symbol, ExtendedGenome[i:i+(n//2)])
    return array
def Skew(Genome):
    skew = {}
    skew[0]=0
    n = len(Genome)
    for i in range(1, n+1):       
        skew[i] = skew[i-1]
        if Genome[i-1] == "G": 
            skew[i] = skew[i-1]+1
        elif Genome[i-1] == "C":
            skew[i] = skew[i-1]-1 
        else:
            skew[i] = skew[i-1]
        return skew
    for i in skew.items():
        Skew(Genome)

2 answers:

Answer 0: (score: 1)

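The problem is simpler than you're making it. The biggest issues seem to be: your return statement is inside the loop instead of after it, so the function returns after the first iteration (which is why your output has only two values); you're using a dictionary where a list would do; and you have an unnecessary recursive call to Skew() after the loop. Note that the skew has one more entry than the genome has bases: the expected output starts with the 0 for the empty prefix and ends with the value after the last base. The loop therefore has to run through len(genome) inclusive, as your range(1, n+1) already does.

Here is a simplified, working version of your code:

Genome = "AGCGTGCCGAAATATGCCGCCAGACCTGCTGCGGTGGCCTCGCCGACTTCACGGATGCCAAGTGCATAGAGGAAGCGAGCAAAGGTGGTTTCTTTCGCTTTATCCAGCGCGTTAACCACGTTCTGTGCCGACTTT"

def Skew(genome):
    skew = [0]

    for i in range(1, len(genome) + 1):  # include the last base: len(genome) + 1 entries in total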
        skew.append(skew[-1])

        if genome[i - 1] == "G": 
            skew[i] = skew[i - 1] + 1
        elif genome[i - 1] == "C":
            skew[i] = skew[i - 1] - 1

    return skew

print(Skew(Genome))
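
As a cross-check, here is a minimal sketch of the same running count written with itertools.accumulate. The names skew_list and STEP are illustrative and do not come from the question. It assumes Python 3.8 or later (for the initial argument) and that the result should have len(genome) + 1 entries, starting with 0 for the empty prefix, as the expected output in the question suggests.

```python
from itertools import accumulate

# Contribution of each base to the running G-minus-C count
# (hypothetical helper table; any base other than G or C counts as 0).
STEP = {"G": 1, "C": -1}

def skew_list(genome):
    """Return [skew_0, ..., skew_n], where skew_i covers the first i bases."""
    # initial=0 supplies skew_0, so the result has len(genome) + 1 entries.
    return list(accumulate((STEP.get(base, 0) for base in genome), initial=0))

# First ten bases of the test dataset from the question.
print(skew_list("AGCGTGCCGA"))  # [0, 0, 1, 0, 1, 1, 2, 1, 0, 1, 1]
```

The printed values match the first eleven entries of the expected output, which is a quick way to confirm the loop bounds before submitting.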
  

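Can you let me know how I can do this in dictionary form?

If you want the skew container to be a dictionary, as in your original code, you could do something like this: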

def Skew(genome):
    skew = {0:0}

    for i in range(1, len(genome) + 1):  # run through the last base, so there are len(genome) + 1 entries

        if genome[i - 1] == "G":
            skew[i] = skew[i - 1] + 1
        elif genome[i - 1] == "C":
            skew[i] = skew[i - 1] - 1
        else:
            skew[i] = skew[i - 1]

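    return [value for (key, value) in sorted(skew.items())]

However, I don't recommend it. Dictionaries are typically used to represent sparse arrays, and this is not a sparse array. Another option is an OrderedDict, which lets you skip the list comprehension and simply return skew.values().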
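
To make the ordering point concrete, here is a small sketch assuming Python 3.7 or later, where a plain dict already preserves insertion order, so OrderedDict is not strictly needed. The name skew_dict is hypothetical. Because the keys are inserted as 0, 1, 2, and so on, the values come out in position order without sorting; values() returns a view, so it is wrapped in list() here.

```python
def skew_dict(genome):
    """Build the skew as a dict keyed by prefix length (illustrative name)."""
    skew = {0: 0}  # skew of the empty prefix
    for i, base in enumerate(genome, start=1):
        if base == "G":
            skew[i] = skew[i - 1] + 1
        elif base == "C":
            skew[i] = skew[i - 1] - 1
        else:
            skew[i] = skew[i - 1]
    return skew

skew = skew_dict("CATGGGCATCGGCCATACGCC")
# Keys were inserted in increasing order, so no sorting is required here.
print(list(skew.values()))
```

If the caller only needs the sequence of values, the list version is still the simpler choice; the dict form mainly helps when the positions are looked up by key.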

Answer 1: (score: 0)

def Skew(Genome):
    skew = {0: 0}  # skew of the empty prefix; without it, skew[base - 1] raises KeyError on the first pass

    for base in range(1, len(Genome) + 1):  # positions 1..len(Genome), so the last base is included

        if Genome[base - 1] == "G":  # position `base` corresponds to string index base - 1
            skew[base] = skew[base - 1] + 1
        elif Genome[base - 1] == "C":
            skew[base] = skew[base - 1] - 1
        else:
            skew[base] = skew[base - 1]  


    return skew