Function to generate substrings from a string and replace certain characters

Time: 2016-03-12 15:58:08

Tags: python string recursion

I have a string of length n. For example:

s = "gcgcagagacgcaagcctaRgggSgggggttgggggggcgtgt"

I want to take a substring:

s1 = s[0:20]

Result:

s1 = "gcgcagagacgcaagcctaR"

Then I check whether it contains any character that is not "a", "c", "g", or "t". That is the case for s1, because it ends with "R".
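
A minimal sketch of that check, assuming only lowercase a/c/g/t count as plain bases (the helper name `has_special` is hypothetical and not part of the original post):

```python
# Hypothetical helper: True if the window contains anything other than a/c/g/t.
PLAIN_BASES = set("acgt")

def has_special(window):
    return any(ch not in PLAIN_BASES for ch in window)

s = "gcgcagagacgcaagcctaRgggSgggggttgggggggcgtgt"
print(has_special(s[0:20]))  # this window ends with "R"
print(has_special(s[0:19]))  # this window contains only a/c/g/t
```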

The next step is to replace "R" with "A" and with "G" (or "S" with "C" and with "G"), i.e. to create two new strings:

"gcgcagagacgcaagcctaA"
"gcgcagagacgcaagcctaG"

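One way to produce those strings without recursion is `itertools.product`. This is only a sketch: the `REPLACEMENTS` table and the `expand` function are hypothetical names, and the table covers just the two codes named above.

```python
from itertools import product

# Only the two codes named in the post; any other character is kept as is.
REPLACEMENTS = {"R": "AG", "S": "CG"}

def expand(window):
    """Return every string obtained by replacing each R/S with each of its options."""
    # Each position contributes either its replacement options or itself.
    options = [REPLACEMENTS.get(ch, ch) for ch in window]
    return ["".join(combo) for combo in product(*options)]

for variant in expand("gcgcagagacgcaagcctaR"):
    print(variant)
```

Because `product` yields every combination of per-position options, a window with several special characters is handled by the same call.
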
Then I take a new substring of s:

s1 = s[1:21]

I repeat this until I reach the end of the original string. In this example, the last substring would be:

s1 = s[23:43]
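
The sliding step can be sketched as a small generator. The name `windows` and the default size of 20 are assumptions taken from the example above, not code from the post:

```python
def windows(sequence, size=20):
    """Yield every substring of length `size`, moving one position at a time."""
    for start in range(len(sequence) - size + 1):
        yield sequence[start:start + size]

s = "gcgcagagacgcaagcctaRgggSgggggttgggggggcgtgt"
all_windows = list(windows(s))
print(all_windows[0])   # same as s[0:20]
print(all_windows[-1])  # the final 20 characters of s
```

A sequence shorter than `size` yields no windows at all, since the range is empty.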

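If a substring contains two special characters, 4 new strings are generated; with three, 8; and so on. If there are no special characters, the substring is printed as is and the window moves on. There are more special characters than the ones in this example, but the principle is the same.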
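
The counts follow from multiplying the number of options at each position. The sketch below illustrates this; note that only the R and S mappings come from the post, while the remaining table entries are an assumption based on the standard IUPAC nucleotide ambiguity codes, and `variant_count` is a hypothetical helper.

```python
from math import prod

# R and S come from the post; the other entries are an assumption based on
# the standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
}

def variant_count(window):
    """Number of strings produced by expanding every code in `window`."""
    # Plain characters contribute a factor of 1, codes contribute 2 or 3.
    return prod(len(IUPAC.get(ch, ch)) for ch in window)

print(variant_count("acgt"))   # no special characters
print(variant_count("aRcSg"))  # two two-way codes
print(variant_count("RRS"))    # three two-way codes
```

Under this assumed table, the three-way codes (V, H, D, B) multiply the count by 3 rather than 2, so the totals are not always powers of two.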

What I have so far:

def generate_substrings(sequence, start, end):
    codes = set("MRWSYKVHDB")
    s = sequence[start:end]
    start += 1
    end += 1

    if end > len(sequence):
        return

    elif not any((nt in codes) for nt in s):
        print(s)

    else:
        for i, nt in enumerate(s):
            if nt not in "acgt":
                if nt == "R":
                    s = s.replace("R", "A")
                    print(s)
                    return generate_substrings(sequence, start, end)
                    s = s.replace("R", "G")
                    return generate_substrings(sequence, start, end)
                elif nt == "S":
                    s = s.replace("S", "C")
                    return generate_substrings(sequence, start, end)
                    s = s.replace("S", "G")
                    return generate_substrings(sequence, start, end)

generate_substrings("gcgcagagacgcaagcctaRgggSgggggttgggggggcgtgt", 0, 20)

I know this script does not do what I need, but it is what I have right now. If anyone can help me extend (or rewrite) it, I would be very grateful.

1 Answer:

Answer 0: (score: 2)

def generate_substrings(sequence):
    # Slide a 20-character window over the sequence, one position at a time.
    for i in range(len(sequence) - 19):
        currentSequence = sequence[i:i+20]
        recursiveReplaceLetter(currentSequence)

def recursiveReplaceLetter(s):
    # Expand the first special character found, recurse on each variant,
    # and print the string once no special characters remain.
    replacements = {"R": "AG", "S": "CG"}
    for i, nt in enumerate(s):
        if nt in replacements:
            for base in replacements[nt]:
                recursiveReplaceLetter(s[:i] + base + s[i+1:])
            return
    print(s)

sequence="gRRcagagacgcaagcctaRgggSgggggttgggggggcgtgt"
generate_substrings(sequence)