Question

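I have a list of DNA sequences, shown below. I want to get the consensus sequence, i.e. the base with the highest frequency at each position.

test.txt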

>human
ATCAATTGCT
>human
GCTAGCTAGC
>human
GCTAGCTAGC
>human
GCTGATCGGC
>human
GCTTACAACG

With the code below, I get the total counts of A, C, G and T at each position.

Code

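from Bio import motifs

# Read the sites, then append the per-position base counts to the output file.
with open("test.txt") as handle:
    motif = motifs.read(handle, "sites")

# Opening the output in a with block ensures it is flushed and closed.
with open("test_output.txt", "a") as output:
    output.write(str(motif.counts))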

Sample output

        0      1      2      3      4      5      6      7      8      9
A:   1.00   0.00   0.00   3.00   3.00   0.00   1.00   3.00   0.00   0.00
C:   0.00   4.00   1.00   0.00   0.00   3.00   1.00   0.00   2.00   3.00
G:   4.00   0.00   0.00   1.00   2.00   0.00   0.00   2.00   3.00   1.00
T:   0.00   1.00   4.00   1.00   0.00   2.00   3.00   0.00   0.00   1.00

How can I also output the most frequent base at each position, as shown in the last row below?

Desired output

        0      1      2      3      4      5      6      7      8      9
A:   1.00   0.00   0.00   3.00   3.00   0.00   1.00   3.00   0.00   0.00
C:   0.00   4.00   1.00   0.00   0.00   3.00   1.00   0.00   2.00   3.00
G:   4.00   0.00   0.00   1.00   2.00   0.00   0.00   2.00   3.00   1.00
T:   0.00   1.00   4.00   1.00   0.00   2.00   3.00   0.00   0.00   1.00
        G      C      T      A      A      C      T      A      G      C

Answer 1

The motif object has a consensus attribute that does exactly what you want:

 output.write("\t".join(list(motif.consensus)))
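
To make the idea concrete, here is a minimal plain-Python sketch of what "take the most frequent base at each position" means, using the five sequences from the question. It is not Biopython's implementation. The helper names count_bases and consensus_from_counts are made up for illustration, the sequences are assumed to be of equal length and to contain only A, C, G and T, and ties are broken in favour of the first base in A, C, G, T order, which may differ from how Biopython resolves ties.

```python
sequences = [
    "ATCAATTGCT",
    "GCTAGCTAGC",
    "GCTAGCTAGC",
    "GCTGATCGGC",
    "GCTTACAACG",
]


def count_bases(seqs, alphabet="ACGT"):
    """Return {base: [count at position 0, 1, ...]} for equal-length sequences.

    Hypothetical helper: assumes every character is in `alphabet`.
    """
    length = len(seqs[0])
    counts = {base: [0] * length for base in alphabet}
    for seq in seqs:
        for pos, base in enumerate(seq):
            counts[base][pos] += 1
    return counts


def consensus_from_counts(counts):
    """Pick the base with the highest count at each position.

    Ties go to the first base in the dict's insertion order (an assumption
    of this sketch, not necessarily Biopython's rule).
    """
    length = len(next(iter(counts.values())))
    return "".join(
        max(counts, key=lambda base: counts[base][pos]) for pos in range(length)
    )


counts = count_bases(sequences)
consensus = consensus_from_counts(counts)

# Build the counts table with a trailing consensus row, in 7-character columns.
lines = ["  " + "".join(f"{pos:>7d}" for pos in range(len(consensus)))]
for base, row in counts.items():
    lines.append(f"{base}:" + "".join(f"{value:>7.2f}" for value in row))
lines.append("  " + "".join(f"{base:>7}" for base in consensus))
table = "\n".join(lines)
print(table)
```

The last line of the printed table is the consensus row, aligned under the count columns. With Biopython, the same row could presumably be built from str(motif.consensus) instead of the hand-rolled helper.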
