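I'd like to know what could be going wrong in the code below, and why I'm getting the error KeyError: '['.
The program is meant to transcribe the input DNA sequence into an RNA sequence, and then translate the RNA sequence stored in the list RNA into an amino acid sequence using the AMINO_ACIDS dict.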
DNA = "ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC"
RNA = []
AMINO_ACIDS = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
"UCU":"S", "UCC":"s", "UCA":"S", "UCG":"S",
"UAU":"Y", "UAC":"Y", "UAA":"STOP", "UAG":"STOP",
"UGU":"C", "UGC":"C", "UGA":"STOP", "UGG":"W",
"CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
"CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
"CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
"CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
"AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
"ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
"AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
"AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
"GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
"GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
"GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
"GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}
RNA_2 = str(RNA)
for char in DNA:
    if char == "G":
        RNA.append("C")
    elif char == "C":
        RNA.append("G")
    elif char == "A":
        RNA.append("U")
    elif char == "T":
        RNA.append("A")
translated = ''.join(AMINO_ACIDS[i] for i in RNA_2)
print("DNA sequence: " + DNA)
print()
print("Length of DNA sequence in base pairs: " + str(len(DNA)))
print()
print("RNA sequence of DNA sequence: " +("".join(RNA)))
print()
print("AMINO ACID sequence: " + str(translated))
Answer 0 (score: 0)
You don't need RNA_2, but you do need a way to split the RNA string into chunks of three characters. Borrowing the chunks function from this post:
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    # range works on Python 2 and 3; the borrowed version used Python 2's xrange.
    for i in range(0, len(l), n):
        yield l[i:i+n]
DNA = "ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC"
RNA = []
AMINO_ACIDS = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
"UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
"UAU":"Y", "UAC":"Y", "UAA":"STOP", "UAG":"STOP",
"UGU":"C", "UGC":"C", "UGA":"STOP", "UGG":"W",
"CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
"CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
"CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
"CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
"AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
"ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
"AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
"AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
"GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
"GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
"GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
"GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}
for char in DNA:
    if char == "G":
        RNA.append("C")
    elif char == "C":
        RNA.append("G")
    elif char == "A":
        RNA.append("U")
    elif char == "T":
        RNA.append("A")
translated = ''.join(AMINO_ACIDS[i] for i in chunks("".join(RNA), 3))
print("DNA sequence: " + DNA)
print()
print("Length of DNA sequence in base pairs: " + str(len(DNA)))
print()
print("RNA sequence of DNA sequence: " +("".join(RNA)))
print()
print("AMINO ACID sequence: " + str(translated))
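Result (run under Python 2, where print() with no arguments prints "()"; under Python 3 those lines are blank):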
DNA sequence: ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC
()
Length of DNA sequence in base pairs: 69
()
RNA sequence of DNA sequence: UGUUCUACGGUAACAGGGGGCCGGAGGACGACGACGACGAGAGGCCCCGGUGCCGGUGGCGACGGGACG
()
AMINO ACID sequence: CSTVTGGRRTTTTRGPGAGGDGT
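More on the original error. I think you may be misunderstanding what RNA_2 = str(RNA) does. It does not mean "now and in the future, RNA_2 will be the string version of RNA and will stay up to date whenever RNA changes". It means "convert the contents of RNA to a string right now, and that is what RNA_2 is, even if RNA changes later". So even after you append values to RNA, RNA_2 is still "[]". That is where the KeyError comes from: "[" is the first character of RNA_2, and "[" is not a key in AMINO_ACIDS.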
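A minimal reproduction of that failure might look like the sketch below. It reuses the names RNA, RNA_2 and AMINO_ACIDS from the question, but the one-entry dict and the error_key variable are only illustrative assumptions, not part of the original program.

```python
# Illustrative reproduction; the one-entry dict stands in for the full codon table.
RNA = []
RNA_2 = str(RNA)             # snapshot taken while RNA is still empty
RNA.append("U")              # later changes to RNA do not touch RNA_2

AMINO_ACIDS = {"UGU": "C"}   # assumption: a single entry is enough to show the error

print(repr(RNA_2))           # RNA_2 is still the two-character string '[]'

error_key = None
try:
    AMINO_ACIDS[RNA_2[0]]    # looks up "[", the first character of RNA_2
except KeyError as err:
    error_key = err.args[0]  # the key that was not found
    print("KeyError:", repr(error_key))
```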
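But even if you moved RNA_2 = str(RNA) to after the append loop, I don't think it would give you the result you want. It would be "['U', 'G', 'U', 'U', 'C', ..." rather than "UGUUC". If you want the latter, use "".join(RNA) instead of str(RNA).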
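To make the difference concrete, here is a small comparison on a five-element list. The variable names as_repr and as_text are hypothetical and chosen just for this sketch.

```python
# Compare the list's printed representation with the joined string.
RNA = ["U", "G", "U", "U", "C"]

as_repr = str(RNA)       # includes brackets, quotes, commas and spaces
as_text = "".join(RNA)   # only the bases, concatenated

print(as_repr)
print(as_text)
```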
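However, even with "".join(RNA), iterating over it and looking each item up in AMINO_ACIDS won't work, because the keys of AMINO_ACIDS are all three characters long and iterating over a string gives you one character at a time. That is where chunks comes in: it lets you iterate three characters at a time.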
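The sketch below contrasts the two ways of iterating on a short RNA string. The 11-base input and the filtering of an incomplete final codon are assumptions added for illustration; the 69-base sequence in the question divides evenly by three, so it never hits that case.

```python
def chunks(seq, n):
    """Yield successive n-sized pieces of seq."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

rna = "UGUUCUACGGU"            # assumed input: three full codons plus two bases

single_chars = list(rna)[:3]   # plain iteration yields one base at a time
codons = list(chunks(rna, 3))  # chunked iteration yields three bases at a time

# Keep only complete codons so a short final piece cannot raise KeyError.
full_codons = [c for c in codons if len(c) == 3]

print(single_chars)
print(codons)
print(full_codons)
```

Whether to drop, report, or reject a trailing partial codon is a design choice; dropping it here is just the simplest option.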