How do I download _full_ RefSeq records with Efetch?

Time: 2019-03-20 16:46:21

Tags: biopython ncbi genbank

I am having trouble downloading full records from the Nucleotide db. I use:

from Bio import Entrez
from Bio import SeqIO

with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle:
    seq_record = SeqIO.read(handle, "gb") 

print(seq_record)

This gives me a shortened version of the gb file, so the command:

seq_record.features

does not return the annotated features.

By contrast, when I do the same thing with a GenBank ID, there is no problem:

with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="CP014768.1") as handle:
    seq_record = SeqIO.read(handle, "gb") 

print(seq_record)

After that, I can extract all of the annotated features from the list seq_record.features.

Is it possible to download full RefSeq records with Efetch?

1 answer:

Answer 0: (score: 1)

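You need to either pass style="withparts" or change rettype to gbwithparts to get all of the features. With plain rettype="gb", this record comes back with a single feature, as the session below shows. The table of valid rettype and retmode values in the NCBI E-utilities documentation has some information.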

>>> from Bio import Entrez
>>> from Bio import SeqIO
>>> Entrez.email = 'someone@email.com'
>>> with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb") 
... 
>>> len(seq_record.features)
1
>>> with Entrez.efetch(db="nuccore", rettype="gbwithparts", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb") 
... 
>>> len(seq_record.features)
10616
>>> with Entrez.efetch(db="nuccore", rettype="gb", style="withparts", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb")
... 
>>> len(seq_record.features)
10616