Question

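I'm working on a simple nucleotide counter in Python 2.7. In one of the methods I wrote, I'd like to print the g, c, a, t values sorted by the number of times they appear in the gene sheet. What would be a better way to do this? Thanks in advance!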

def counting(self):
    g = 0
    a = 0
    c = 0
    t = 0
    # The with block closes the file even if an error occurs
    with open("BRCA1.txt", "r") as gene:
        gene.readline()  # read and discard the first line
        for line in gene:
            line = line.lower()
            for char in line:
                if char == "g":
                    g += 1
                elif char == "a":
                    a += 1
                elif char == "t":
                    t += 1
                elif char == "c":
                    c += 1
    print "number of g's: %d" % g
    print "number of c's: %d" % c
    print "number of a's: %d" % a
    print "number of t's: %d" % t

Answer 1

Use the Counter class.

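from collections import Counter
# gene is the open file object from the question
counter = Counter(char for line in gene for char in line.lower())
# most_common() yields (char, count) pairs, highest count first
for char, count in counter.most_common():
    print "number of %s's: %d" % (char, count)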
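
For comparison, the same ordering can be produced without Counter, using a plain dict and sorted(). This is only a rough sketch: the function name sorted_counts and the sample string are made up for illustration, and the alphabetical tie-break is an assumption rather than something the question asks for. The __future__ import lets it run on both Python 2.7 and Python 3.

```python
from __future__ import print_function


def sorted_counts(sequence):
    """Return (nucleotide, count) pairs, highest count first."""
    counts = {"g": 0, "a": 0, "t": 0, "c": 0}
    for char in sequence.lower():
        # Anything that is not g, a, t or c (e.g. a newline) is skipped
        if char in counts:
            counts[char] += 1
    # Sort by descending count; ties are broken alphabetically (an assumption)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data, not taken from BRCA1.txt
result = sorted_counts("GATTACA\n")
for nucleotide, count in result:
    print("number of %s's: %d" % (nucleotide, count))
```

Unlike Counter, this version also reports nucleotides whose count is zero, because every key is created up front.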

Answer 2

Use the collections.Counter class.

from collections import Counter
def counting(self):
    with open("BRCA1.txt", "r") as gene:
        nucleotide_counts = Counter(char for line in gene for char in line.lower().strip())
    for (nucleotide, count) in nucleotide_counts.most_common():
        print "number of %s's: %d" % (nucleotide, count)

If your sequence might contain things other than nucleotides, this should work:

from collections import Counter
def counting(self):
    nucleotides = frozenset(('g', 'a', 't', 'c'))
    with open("BRCA1.txt", "r") as gene:
        nucleotide_counts = Counter(char for line in gene for char in line.lower() if char in nucleotides)
    for (nucleotide, count) in nucleotide_counts.most_common():
        print "number of %s's: %d" % (nucleotide, count)

This version doesn't need strip, because the set-membership check already excludes newlines and other whitespace.
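
To see the filtering at work without the real file, here is a small sketch that runs the same generator expression over an in-memory list of lines. The helper name count_nucleotides and the sample lines are assumptions made for illustration; they merely stand in for the contents of BRCA1.txt. The __future__ import lets it run on both Python 2.7 and Python 3.

```python
from __future__ import print_function
from collections import Counter

NUCLEOTIDES = frozenset(("g", "a", "t", "c"))


def count_nucleotides(lines):
    """Count g, a, t and c in an iterable of lines, ignoring everything else."""
    return Counter(
        char
        for line in lines
        for char in line.lower()
        if char in NUCLEOTIDES
    )


# Hypothetical lines standing in for the file; note the 'n', the space
# and the trailing newlines, none of which should be counted.
sample_lines = ["GATTACA\n", "aacn t\n"]
counts = count_nucleotides(sample_lines)
ranked = counts.most_common()
for nucleotide, count in ranked:
    print("number of %s's: %d" % (nucleotide, count))
```

Because an open file is itself an iterable of lines, the same helper could be given the file object from the with block directly.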

Sorting values (for loop)
