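I'm working on a simple nucleotide counter in Python 2.7. In one of the methods I wrote, I want to print the g, c, a, and t values sorted by the number of times they appear in the gene file. What would be a better way to do this? Thanks in advance!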
def counting(self):
    gene = open("BRCA1.txt", "r")
    g = 0
    a = 0
    c = 0
    t = 0
    # Skip the first line of the file before counting.
    gene.readline()
    for line in gene:
        line = line.lower()
        for char in line:
            if char == "g":
                g += 1
            if char == "a":
                a += 1
            if char == "t":
                t += 1
            if char == "c":
                c += 1
    print "number of g's: %d" % g
    print "number of c's: %d" % c
    print "number of a's: %d" % a
    print "number of t's: %d" % t
Answer 0: (score: 6)
Use the Counter class.
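from collections import Counter

# gene is the open file object from the question.
counter = Counter(char for line in gene for char in line.lower())
# most_common() returns (element, count) pairs, highest count first.
for char, count in counter.most_common():
    print "number of %s's: %d" % (char, count)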
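To see what the Counter approach above produces, here is a minimal sketch that runs on a small in-memory sequence instead of the file. The `sample_lines` list is a made-up stand-in for the lines of BRCA1.txt, and `print` is written with a single parenthesized argument so the snippet behaves the same on Python 2.7 and Python 3.

```python
from collections import Counter

# Hypothetical stand-in for the lines read from BRCA1.txt.
sample_lines = ["GGGATC\n", "ggtaca\n"]

# Count every character, lowercased, across all lines.
counter = Counter(char for line in sample_lines for char in line.lower())

# most_common() yields (element, count) pairs, highest count first.
# %r is used here so the newline character shows up visibly as '\n'.
for char, count in counter.most_common():
    print("number of %r's: %d" % (char, count))
```

Because nothing is filtered or stripped here, the trailing newline of each line is counted as a character too, which is the issue the next answer addresses.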
Answer 1: (score: 4)
from collections import Counter

def counting(self):
    # The with block closes the file once counting is done.
    with open("BRCA1.txt", "r") as gene:
        nucleotide_counts = Counter(char for line in gene for char in line.lower().strip())

    for (nucleotide, count) in nucleotide_counts.most_common():
        print "number of %s's: %d" % (nucleotide, count)
If your sequence might contain anything other than nucleotides, this should work:
from collections import Counter

def counting(self):
    nucleotides = frozenset(('g', 'a', 't', 'c'))

    with open("BRCA1.txt", "r") as gene:
        # Keep only characters that are in the nucleotide set.
        nucleotide_counts = Counter(char for line in gene for char in line.lower() if char in nucleotides)

    for (nucleotide, count) in nucleotide_counts.most_common():
        print "number of %s's: %d" % (nucleotide, count)
This version does not need strip, because the set-membership check already excludes newlines and other whitespace.
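Note that the question's code calls gene.readline() to skip the first line before counting, while the snippets in this answer count every line. A minimal sketch that combines the filtered Counter with that skip might look like the following. The function name `count_nucleotides`, its `skip_header` parameter, and the sample data are hypothetical names chosen for illustration, and the function takes any iterable of lines (an open file works) so it can be tried without BRCA1.txt.

```python
from collections import Counter

NUCLEOTIDES = frozenset("gatc")

def count_nucleotides(lines, skip_header=True):
    """Count g/a/t/c in an iterable of lines, ignoring everything else."""
    lines = iter(lines)
    if skip_header:
        # Drop the first line, as gene.readline() does in the question.
        next(lines, None)
    return Counter(
        char
        for line in lines
        for char in line.lower()
        if char in NUCLEOTIDES
    )

# Made-up sample: a header line, then sequence data with a stray 'N'.
sample = [">hypothetical header\n", "GGGATC\n", "ggtacaN\n"]
counts = count_nucleotides(sample)

# Single-argument print() behaves the same on Python 2.7 and Python 3.
for nucleotide, count in counts.most_common():
    print("number of %s's: %d" % (nucleotide, count))
```

With a real file, the call would be something like `with open("BRCA1.txt") as gene: counts = count_nucleotides(gene)`, assuming the first line of the file really is a header that should not be counted.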