所以我应该设计一个计算DNA序列的程序,并计算各个碱基对。这是我到目前为止所拥有的:
class dnaString (str):
def __new__(self,s):
return str.__new__(self,s.upper())
def length (self):
return (len(self))
def getATCG (self,num_A,num_T,num_C,num_G):
num_A = self.count("A")
num_T = self.count("T")
num_C = self.count ("C")
num_G = self.count ("G")
return ( (self.length(), num_A, num_T, num_G, num_C) )
def printnum_A (self):
print ("Adenine base content: {0}".format(self.count("A")))
dna = input("Enter a dna sequence: ")
x=dnaString(dna)
该程序并没有真正做任何事情,因为我刚刚开始使用python,我不知道如何解决这个问题,所以它有效。我还应该添加什么?我知道它还未完成。
答案 0 :(得分:1)
我不确定问题是什么,但是因为你没有调用方法'printnum_A`,所以什么都没有打印。如果你这样称呼它,它可以工作:
dna = input("Enter a dna sequence: ")
x=dnaString(dna)
x.printnum_A()
根据评论更新
声明类的方法是不够的,您还需要在需要时调用它们。就像printnum_T
:
class dnaString (str):
def __new__(self,s):
return str.__new__(self,s.upper())
def length (self):
return (len(self))
def getATCG (self,num_A,num_T,num_C,num_G):
num_A = self.count("A")
num_T = self.count("T")
num_C = self.count ("C")
num_G = self.count ("G")
return ( (self.length(), num_A, num_T, num_G, num_C) )
def printnum_A (self):
print ("Adenine base content: {0}".format(self.count("A")))
# here the method is declared
def printnum_T (self):
print ("Adenine base content: {0}".format(self.count("T")))
dna = input("Enter a dna sequence: ")
x=dnaString(dna)
x.printnum_A()
# Here I call my method on `x`
x.printnum_T()
答案 1 :(得分:0)
这有帮助吗?它适用于我碰巧安装的Python 2.7.3和3.2.3。
import itertools
import sys
def pairwise(iterable):
"s -> (s0,s1), (s1,s2), (s2, s3), ..."
a, b = itertools.tee(iterable)
next(b, None)
if sys.version_info[0] > 2:
return zip(a,b)
return itertools.izip(a, b)
class DnaSequence():
Names = {
'A' : 'adenine',
'C' : 'cytosine',
'G' : 'guanine',
'T' : 'thymine'
}
Bases = Names.keys()
def __init__(self, seq):
self._string = seq
self.bases = { x:0 for x in DnaSequence.Bases }
self.pairs = { x+y:0 for x in DnaSequence.Bases
for y in DnaSequence.Bases }
for base in seq:
if base in self.bases:
self.bases[base] += 1
for x,y in pairwise(seq):
pair = x+y
if pair in self.pairs:
self.pairs[pair] += 1
def printCount(self, base):
if base in DnaSequence.Names:
print(DnaSequence.Names[base].capitalize() +
" base content: " + str(self.bases[base]))
else:
sys.stderr.write('No such base ("%s")\n' % base)
def __repr__(self):
return self._string
d = DnaSequence("CCTAGTGTTAGCTAGTCTAGGGAT")
for base in DnaSequence.Bases:
d.printCount(base)
# Further:
print(d)
print(d.bases)
print(d.pairs)
它是计算基数(A,C,G,T)和相邻对的所有出现的完整示例(例如在ACCGTA中,对AC,CC,CG,GT,TA都将为1,笛卡尔积ACGT x ACGT的其他11种可能组合都是0)。
此处使用的计数方法在构造函数中扫描字符串一次,而不是每次调用getATGC()
时扫描它四次。
答案 2 :(得分:0)
我认为课程可以简化一下:
class DnaString(str):
def __new__(self, s):
return str.__new__(self, s.strip().upper())
def __init__(self, _):
self.num_A = self.count("A")
self.num_C = self.count("C")
self.num_G = self.count("G")
self.num_T = self.count("T")
def stats(self):
return len(self), self.num_A, self.num_C, self.num_G, self.num_T
然后
dna = raw_input("Enter a dna sequence: ")
d = DnaString(dna)
print(d)
print(d.stats())
给出
Enter a dna sequence: ACGTACGTA
ACGTACGTA
(9, 3, 2, 2, 2)
答案 3 :(得分:-1)
您可以使用字典来组织和检索您的计数。例如:
DNASeq = raw_input("Enter a DNA sequence: ")
SeqLength = len(DNASeq)
print 'Sequence Length:', SeqLength
BaseKey = list(set(DNASeq)) #creates a list from the unique characters in the DNASeq
Dict = {}
for char in BaseKey:
Dict[char] = DNASeq.count(char)
print Dict