Downloading results from BLAST using a for loop

Time: 2013-12-26 18:16:57

Tags: python bioinformatics biopython

I am trying to get results for the different sequences in a list, so my code is:

from Bio.Blast import NCBIXML

from Bio.Blast import NCBIWWW

lst=['TGCCCCGAAAATGAACTCAGTAAAGAATGACAGTTTCGCAAGACCCGTTGCTTTTTCAGTGCTAGCTAGCTGACTGATCGTAGCTGACGTAGTCTAGCTAGC','ATCGATCGTACTACGTAGCTGATCGTAGCTAGCTAGCTGATCGTAGCTATCGTACGTAGCTGATCGATCGTAGCTGACTGACGTACGTAGCTGATCGTAGCTAGCTAGCTAGCTGATCGATC']

eq="Homo sapiens[Organism]"


for i in range(0,2):

    rslt  = NCBIWWW.qblast("blastn", "nr", lst[i],entrez_query=eq)
    rcrds = NCBIXML.parse(rslt)
    br = rcrds.next()
    for alignment in br.alignments:
        for hsp in alignment.hsps:
            if hsp.expect < 2:

                print "***** RECORD ****"+str(i)
                print "sequence:", alignment.title
                print "E-value:", hsp.expect

The output covers only the second item of lst, and with str(i) = 0, giving this BLAST record:
***** RECORD ****0
sequence: gi|27436767|gb|AF274855.3| Homo sapiens chromosome X clone RP11-366F6 map q28, complete sequence

E-value: 0.197898

Any help would be appreciated.

1 answer:

Answer 0 (score: 1)

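Well, your code is correct! But you are searching for data that does not exist. The problem is that if br.alignments is empty, the inner loops never run and nothing is printed (which is the case for the second lst entry):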

>>> br.alignments
[]
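
One way to make the empty case visible is to print a message whenever a record has no alignments. The following is only a minimal sketch in Python 3 syntax (the code in this thread is Python 2). The helper name `summarize_record` and the stand-in objects are hypothetical: the stand-ins mimic only the attributes the question's code reads (`alignments`, `hsps`, `title`, `expect`), so the sketch runs without contacting NCBI.

```python
from types import SimpleNamespace


def summarize_record(index, record, max_expect=2.0):
    """Return report lines for one BLAST record, including the empty case."""
    if not record.alignments:
        # Without this branch an empty record produces no output at all.
        return ["***** RECORD **** %d: no alignments found" % index]
    lines = []
    for alignment in record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect < max_expect:
                lines.append("***** RECORD **** %d" % index)
                lines.append("sequence: %s" % alignment.title)
                lines.append("E-value: %s" % hsp.expect)
    return lines


# Stand-in objects (hypothetical values), not real Biopython records.
hit = SimpleNamespace(title="example hit",
                      hsps=[SimpleNamespace(expect=0.197898)])
records = [SimpleNamespace(alignments=[hit]),   # like lst[0]: has hits
           SimpleNamespace(alignments=[])]      # like lst[1]: no hits

for i, record in enumerate(records):
    print("\n".join(summarize_record(i, record)))
```

With a real search, the record parsed from the `NCBIXML.parse(rslt)` result could presumably be passed in place of the stand-ins, since the helper only relies on the attributes already used in the question.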

Take a look at this snippet:

from Bio.Blast import NCBIXML
from Bio.Blast import NCBIWWW

lst=['TGCCCCGAAAATGAACTCAGTAAAGAATGACAGTTTCGCAAGACCCGTTGCTTTTTCAGTGCTAGCTAGCTGACTGATCGTAGCTGACGTAGTCTAGCTAGC','ATCGATCGTACTACGTAGCTGATCGTAGCTAGCTAGCTGATCGTAGCTATCGTACGTAGCTGATCGATCGTAGCTGACTGACGTACGTAGCTGATCGTAGCTAGCTAGCTAGCTGATCGATC']
eq="Homo sapiens[Organism]"
# download with lst[1]
print "download start"
rslt  = NCBIWWW.qblast("blastn", "nr", lst[1],entrez_query=eq)
print "download end"
print "parsing start"
rcrds = NCBIXML.parse(rslt)
print "parsing end"
br = rcrds.next()
print "br.alignments"
print br.alignments

You can try the same snippet with lst[0], which produces the following output:

>>> br.alignments
[<Bio.Blast.Record.Alignment object at 0x02ACCF90>, <Bio.Blast.Record.Alignment object at 0x02ACCE50>, <Bio.Blast.Record.Alignment object at 0x02ACCCF0>, <Bio.Blast.Record.Alignment object at 0x02ACCE10>, <Bio.Blast.Record.Alignment object at 0x02ACCCD0>, <Bio.Blast.Record.Alignment object at 0x02ACCBF0>, <Bio.Blast.Record.Alignment object at 0x02ACCD70>, <Bio.Blast.Record.Alignment object at 0x02ACC930>, <Bio.Blast.Record.Alignment object at 0x02ACC9B0>, <Bio.Blast.Record.Alignment object at 0x02ACCA90>, <Bio.Blast.Record.Alignment object at 0x02ACCAD0>, <Bio.Blast.Record.Alignment object at 0x02ACCB90>, <Bio.Blast.Record.Alignment object at 0x02ACC850>, <Bio.Blast.Record.Alignment object at 0x02ACC970>]

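So the short version is: the output only gives the BLAST record for str(i) = 0 because for i = 1 there is no data! If you want to see all entries for i = 0, you have to remove the "if hsp.expect < 2" line (and unindent the print statements under it).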
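
As an alternative to deleting the line, the E-value cutoff can be made optional. This is a rough sketch in Python 3 syntax under stated assumptions: `collect_hits` is a hypothetical helper, and the stand-in record with its titles and E-values is invented for illustration so the code runs offline.

```python
from types import SimpleNamespace


def collect_hits(record, max_expect=None):
    """List (title, E-value) pairs; max_expect=None keeps every HSP."""
    hits = []
    for alignment in record.alignments:
        for hsp in alignment.hsps:
            if max_expect is None or hsp.expect < max_expect:
                hits.append((alignment.title, hsp.expect))
    return hits


# Stand-in record (hypothetical values), not a real BLAST result.
record = SimpleNamespace(alignments=[
    SimpleNamespace(title="hit A", hsps=[SimpleNamespace(expect=0.2)]),
    SimpleNamespace(title="hit B", hsps=[SimpleNamespace(expect=7.5)]),
])

print(collect_hits(record))                # no cutoff: both hits
print(collect_hits(record, max_expect=2))  # cutoff as in the question: hit A only
```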

I hope that helps.