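I am trying to write a basic program that converts the DNA string ATGTACATGGGCATAGCCATATA into its RNA sequence, which should be UACAUGUACCCGUAUCGGUAUAU.
Instead, the output is: CCCCCUUUUUUUUUUUUUUUUUUUUUUUU
That is clearly not right. I think the problem is that each letter is being run through the whole string separately.
I am very new to bioinformatics programming, so any advice is welcome.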
DNA = "ATGTACATGGGCATAGCCATATA"
dna_length = len(DNA)
print("DNA: " + DNA)
print()
print("Length of DNA in base pairs: " + str(dna_length))
RNA = []
for char in DNA:
    if char == "G":
        RNA.append("C")
for line in DNA:
    if char == "C":
        RNA.append("G")
for line in DNA:
    if char == "A":
        RNA.append("U")
for line in DNA:
    if char == "T":
        RNA.append("A")
print("".join(RNA))
Answer 0 (score: 3)
I would use a dict to do the substitution, and then a generator expression inside join to perform the conversion.
>>> RNA = {'G':'C', 'C':'G', 'A':'U', 'T':'A'}
>>> DNA = 'ATGTACATGGGCATAGCCATATA'
>>> translated = ''.join(RNA[i] for i in DNA)
>>> translated
'UACAUGUACCCGUAUCGGUAUAU'
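If the input might contain lowercase letters or characters other than the four bases, the plain dict lookup raises a bare KeyError. Here is a minimal sketch of one way to handle that, assuming that normalizing to uppercase and raising a ValueError are acceptable. The names COMPLEMENT and dna_to_rna are illustrative and not part of the original answer.

```python
# Hypothetical helper; the mapping is the same one used in the answer.
COMPLEMENT = {"G": "C", "C": "G", "A": "U", "T": "A"}


def dna_to_rna(dna):
    """Return the complementary RNA string for a DNA string."""
    dna = dna.upper()  # assumption: lowercase input should be accepted
    unknown = set(dna) - set(COMPLEMENT)
    if unknown:
        # Fail with a readable message instead of a bare KeyError.
        raise ValueError("unexpected characters: " + ", ".join(sorted(unknown)))
    return "".join(COMPLEMENT[base] for base in dna)


print(dna_to_rna("ATGTACATGGGCATAGCCATATA"))  # UACAUGUACCCGUAUCGGUAUAU
```

Checking the whole string up front keeps the error message specific about which characters were unexpected.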
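Answer 1 (score: 3)
The problem is that you loop over the same string four times, and every loop after the first still tests char, which by then holds only the last base left over from the first loop. Use a single loop with an if/elif chain instead: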
RNA = []
for char in DNA:
    if char == "G":
        RNA.append("C")
    elif char == "C":
        RNA.append("G")
    elif char == "A":
        RNA.append("U")
    elif char == "T":
        RNA.append("A")
print("".join(RNA))
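To see why the question's version prints a few C's followed by a long run of U's, the following small sketch may help. It assumes the question's four loops run one after another, with the later loops iterating over line while still testing char. The variable stale_hits is a hypothetical name used only for this illustration.

```python
DNA = "ATGTACATGGGCATAGCCATATA"

for char in DNA:
    pass  # a for-loop variable keeps its last value after the loop ends

print(char)  # A, the final base of DNA

# A later loop that iterates with `line` but tests `char` compares the
# same leftover value on every pass, so the test is true every time.
stale_hits = sum(1 for line in DNA if char == "A")
print(stale_hits == len(DNA))  # True
```

Since the leftover value is "A", the loop that checks for "A" appends "U" once per character of the string, regardless of what each character actually is.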
The best solution is to use str.translate (shown here for Python 2, where maketrans is imported from the string module):
>>> from string import maketrans
>>> s = "ATGTACATGGGCATAGCCATATA"
>>> tab = maketrans('GCAT', 'CGUA')
>>> s.translate(tab)
'UACAUGUACCCGUAUCGGUAUAU'
In Python 3, we can do this without any imports:
>>> s = "ATGTACATGGGCATAGCCATATA"
>>> s.translate({ord(k): v for k, v in zip('GCAT', 'CGUA')})
'UACAUGUACCCGUAUCGGUAUAU'
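One detail worth knowing about this approach in Python 3: characters with no entry in the table are copied through unchanged rather than raising an error, and mapping a code point to None deletes the character. The sketch below illustrates both behaviors. The input "ATNG" and the name table_dropping_n are made up for the example.

```python
# Same table as in the answer, built with a dict comprehension (Python 3).
table = {ord(k): v for k, v in zip("GCAT", "CGUA")}

# "N" has no entry in the table, so it is copied through unchanged.
print("ATNG".translate(table))  # UANC

# Mapping a code point to None deletes that character instead.
table_dropping_n = dict(table)
table_dropping_n[ord("N")] = None
print("ATNG".translate(table_dropping_n))  # UAC
```

Whether silent pass-through is desirable depends on the data; if unexpected characters should be reported, validate the input before translating.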
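Answer 2 (score: 2)
str.maketrans builds a translation table from one set of characters to another, and str.translate applies that mapping to a string. This should be the fastest approach: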
>>> DNA_2_RNA = str.maketrans('CGAT', 'GCUA')
>>> DNA = 'ATGTACATGGGCATAGCCATATA'
>>> RNA = DNA.translate(DNA_2_RNA)
>>> RNA
'UACAUGUACCCGUAUCGGUAUAU'
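The speed claim can be checked on your own machine with timeit. The following is a rough sketch rather than a rigorous benchmark. It assumes Python 3, and the function names, the repeated input, and the repetition count are arbitrary choices for illustration. No timings are shown because they vary by machine.

```python
import timeit

# A longer input so that the difference is measurable.
DNA = "ATGTACATGGGCATAGCCATATA" * 1000

PAIRS = {"G": "C", "C": "G", "A": "U", "T": "A"}
TABLE = str.maketrans("CGAT", "GCUA")


def with_join(dna):
    # dict lookup per character, joined into one string
    return "".join(PAIRS[base] for base in dna)


def with_translate(dna):
    # single call that applies the translation table
    return dna.translate(TABLE)


# Both approaches must agree before comparing their speed.
assert with_join(DNA) == with_translate(DNA)

for func in (with_join, with_translate):
    seconds = timeit.timeit(lambda: func(DNA), number=100)
    print(func.__name__, round(seconds, 4))
```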
Answer 3 (score: 0)
Corrected code:
DNA = "ATGTACATGGGCATAGCCATATA"
dna_length = len(DNA)
print("DNA: " + DNA)
print()
print("Length of DNA in base pairs: " + str(dna_length))
RNA = []
for char in DNA:
    if char == "G":
        RNA.append("C")
    elif char == "C":
        RNA.append("G")
    elif char == "A":
        RNA.append("U")
    elif char == "T":
        RNA.append("A")
print("".join(RNA))
Result (as run under Python 2, where print() prints "()"; Python 3 prints a blank line there instead):
DNA: ATGTACATGGGCATAGCCATATA
()
Length of DNA in base pairs: 23
UACAUGUACCCGUAUCGGUAUAU