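How do I create a list of multiple sequence alignments in Biopython?

Asked: 2016-04-07 15:43:36

Tags: biopython

This is probably a simple question, but I'm having trouble building up simple MultipleSeqAlignment objects.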

import sys

from Bio import AlignIO
import Bio.Align

#Read multi-aligned fasta file
alignment = AlignIO.read(sys.argv[1], "fasta")

#some testing values
first_POI = 10 #base position
major = "a" #major allele
minor = "g" #minor allele

#create lists of sequence ids that are major or minor allele
align_major = Bio.Align.MultipleSeqAlignment([])
align_minor = Bio.Align.MultipleSeqAlignment([])

for record in alignment:
    if (record.seq[first_POI] == major):
        #compile the sequences that have major allele
        align_major = align_major + record
    elif (record.seq[first_POI] == minor):
        #compile sequences with minor allele
        align_minor = align_minor + record    

I get this error:

  File "FindHaplotypes.py", line 53, in <module>
    align_major=align_major+record
  File "C:\Python34\lib\site-packages\Bio\Align\__init__.py", line 385, in __add__
    raise NotImplementedError

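So I'm a bit confused. I can imagine that the MultipleSeqAlignment from Bio.Align might be a different object type from what Bio.AlignIO stores. I assumed that since both deal with MSAs, they would be the same. I know that I can add my AlignIO objects together like strings, but the problem is that I don't know how to initialize an empty AlignIO object so that I can add to it in the way shown above. Previously I had to do the addition in an ugly way, by assigning the first record to a new variable and then entering the for loop.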

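1 answer:

Answer 0 (score: 0)

The error is clear: NotImplementedError. Biopython's implementation of __add__ for MultipleSeqAlignment requires both operands to be instances of MultipleSeqAlignment. Addition involving any other kind of object, such as a SeqRecord, is not implemented.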
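To illustrate the point about __add__, here is a minimal sketch using made-up toy records (the ids s1, s2, s3 and the sequences are assumptions, not data from the question). As far as I can tell from the Biopython source, adding two alignments joins their rows side by side, so both must have the same number of rows, while adding a SeqRecord to an alignment fails.

```python
from Bio.Align import MultipleSeqAlignment
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical toy data: two alignments with the same number of rows.
left = MultipleSeqAlignment([
    SeqRecord(Seq("ACGT"), id="s1"),
    SeqRecord(Seq("ACGA"), id="s2"),
])
right = MultipleSeqAlignment([
    SeqRecord(Seq("GG"), id="s1"),
    SeqRecord(Seq("GC"), id="s2"),
])

# alignment + alignment is supported: rows are joined pairwise,
# so the result gets wider (more columns), not longer (more rows).
combined = left + right
n_rows = len(combined)
n_cols = combined.get_alignment_length()
first_row = str(combined[0].seq)

# alignment + SeqRecord is not supported. The traceback in the question
# shows NotImplementedError; TypeError is also caught here only as a
# precaution in case another Biopython release reports it differently.
try:
    left + SeqRecord(Seq("ACGT"), id="s3")
    add_record_failed = False
except (NotImplementedError, TypeError):
    add_record_failed = True

print(n_rows, n_cols, first_row, add_record_failed)
```

So even wrapping each record in its own alignment would not help here: the + operator is for joining columns, not for adding rows.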

You should use the alignment's append() method instead:

for record in alignment:
    if (record.seq[first_POI]==major):
        #compile the sequences that have major allele
        align_major.append(record)
    elif (record.seq[first_POI]==minor):
        #compile sequences with minor allele
        align_minor.append(record)

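This requires record to be a SeqRecord with the same Alphabet as align_major or align_minor, and with the same length as the sequences already in that alignment.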
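For reference, a self-contained sketch of the append() approach, with the length requirement shown at the end. The toy alignment stands in for what AlignIO.read() would return, and the ids, sequences, and the variable name position are assumptions made for illustration.

```python
from Bio.Align import MultipleSeqAlignment
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical stand-in for the alignment read from the FASTA file.
alignment = MultipleSeqAlignment([
    SeqRecord(Seq("ttaca"), id="s1"),
    SeqRecord(Seq("ttgca"), id="s2"),
    SeqRecord(Seq("ccaca"), id="s3"),
])

position = 2             # 0-based column to inspect
major, minor = "a", "g"  # major and minor alleles

# Start from empty alignments and grow them row by row.
align_major = MultipleSeqAlignment([])
align_minor = MultipleSeqAlignment([])

for record in alignment:
    base = record.seq[position]
    if base == major:
        align_major.append(record)
    elif base == minor:
        align_minor.append(record)

major_ids = [r.id for r in align_major]
minor_ids = [r.id for r in align_minor]

# A record whose length differs from the rows already present is rejected.
try:
    align_major.append(SeqRecord(Seq("tta"), id="short"))
    length_checked = False
except ValueError:
    length_checked = True

print(major_ids, minor_ids, length_checked)
```

An alternative with the same effect is to collect the matching records in ordinary Python lists and pass each list to MultipleSeqAlignment() once the loop has finished.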