I have a dictionary named self.__sequences whose entries read as "ID: DNA sequence". Here is part of the dictionary:
{
'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
'1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
'1111763': ('AGAGTTTGATCCTGGCCTT\n', '')
}
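I want to compute the GC content for a specific sequence ID (some_id). That is, if some_id is in the dictionary, return the GC content of the DNA sequence for that ID; if some_id does not exist, return an error message.
P.S. GC content = (G + C) / (A + T + G + C) of the DNA sequence, where each letter stands for the number of times that base occurs in the sequence.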
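The formula above can be sketched as a small standalone function. This is only a minimal illustration, assuming the sequence is a plain string containing only A, T, G and C plus a possible trailing newline (so that its length equals A + T + G + C); the helper name gc_fraction is hypothetical and not part of the original code.

```python
def gc_fraction(seq):
    """Return (G + C) / (A + T + G + C) for a DNA string."""
    # Strip whitespace so the trailing '\n' is not counted as a base.
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    gc_count = seq.count("G") + seq.count("C")
    # float() keeps the division exact under Python 2 as well as Python 3.
    return gc_count / float(len(seq))


# Sequence taken from ID '1111763' in the dictionary above.
print(gc_fraction("AGAGTTTGATCCTGGCCTT\n"))
```

For this sequence, 9 of the 19 bases are G or C, so the result is 9/19.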
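I wrote the following code (the function is a method of a class), but it gives me an error. I would be grateful if someone could help me improve my code.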
def compute_gc_content(self, some_id=''):
    """compute the gc content for sequence ID (some_id). If some_id is in the
    dictionary, return the gc content of the DNA sequence for that ID; if some_id
    does not exist, return an error message"""
    self.some_id = some_id
    for i in range(len(self.__sequences)):
        if self.some_id in self.__sequences.keys():
            return (self.some_id.values['G']+self.some_id.values['C'])/float(len(self.__sequences))
        else:
            return "This ID does not exist"
So if I print compute_gc_content('1111758'), I want it to print the value of the GC content, such as 0.23.
Answer 0 (score: 0)
I am not sure I understood the question correctly.
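def compute_gc_content(self, some_id=''):
    if some_id in self.__sequences:
        # Look up by the variable some_id, not the literal string 'some_id'.
        seq = self.__sequences[some_id][0].strip()  # drop the trailing '\n'
        return (seq.count('G') + seq.count('C')) / float(len(seq))
    else:
        return "This ID does not exist"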
There is no need to use "in self.__sequences.keys()"; "in self.__sequences" does the same thing.
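As a quick check of that point, the sketch below compares the two membership tests on a one-entry dictionary. The variable name sequences is only a stand-in for self.__sequences, and the ID '0000000' is a made-up example of a missing key.

```python
# Stand-in for self.__sequences, using one entry from the question.
sequences = {"1111762": ("AGAGTTTGATCCTGGCTCAGATTGA\n", "")}

# Testing membership on the dict itself checks its keys.
present_direct = "1111762" in sequences
present_via_keys = "1111762" in sequences.keys()
missing_direct = "0000000" in sequences

print(present_direct, present_via_keys, missing_direct)
```

Both forms give the same answer; the shorter one avoids the extra call to keys().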
Answer 1 (score: 0)
This is what you are looking for:
class gc:
    def __init__(self):
        self.__sequences = {'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''), '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''), '1111763': ('AGAGTTTGATCCTGGCCTT\n', '')}

    def compute_gc_content(self, some_id=''):
        """compute the gc content for sequence ID (some_id). If some_id is in the
        dictionary, return the gc content of the DNA sequence for that ID; if some_id
        does not exist, return an error message"""
        # A single membership test is enough; no loop over the dictionary is needed.
        if some_id not in self.__sequences:
            return "This ID does not exist"
        seq = self.__sequences[some_id][0].strip()  # drop the trailing '\n'
        return (seq.count('G') + seq.count('C')) / float(len(seq))

# print(...) with a single argument works in both Python 2 and Python 3.
print(gc().compute_gc_content('1111758'))