I'm having trouble downloading complete records from the Nucleotide database. I'm using:
from Bio import Entrez
from Bio import SeqIO
with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle:
    seq_record = SeqIO.read(handle, "gb")
print(seq_record)
This gives me a shortened version of the gb file, so the command:
seq_record.features
returns no annotated features.
In contrast, when I do the same thing with a GenBank ID, there is no problem:
with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="CP014768.1") as handle:
    seq_record = SeqIO.read(handle, "gb")
print(seq_record)
After that, I can extract every annotated feature from the list seq_record.features.
Is it possible to download complete RefSeq records with Efetch?
Answer 0 (score: 1)
You need to either use style="withparts" or change rettype to gbwithparts to fetch all the features. This table has some information.
>>> from Bio import Entrez
>>> from Bio import SeqIO
>>> Entrez.email = 'someone@email.com'
>>> with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb")
...
>>> len(seq_record.features)
1
>>> with Entrez.efetch(db="nuccore", rettype="gbwithparts", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb")
...
>>> len(seq_record.features)
10616
>>> with Entrez.efetch(db="nuccore", rettype="gb", style="withparts", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb")
...
>>> len(seq_record.features)
10616
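To check what is being requested independently of Biopython, it can help to look at the underlying E-utilities URL. The following is only a rough sketch using the standard library: the helper name build_efetch_url and its with_parts flag are made up for illustration, and retmode="text" is an assumption on my part (the session above passes retmode="full"). It builds the URL without sending any request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, with_parts=True, email=None):
    """Build (but do not send) an efetch URL for a nuccore GenBank record.

    Hypothetical helper: with_parts=True asks for rettype=gbwithparts,
    the setting the answer above reports as returning all features.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "gbwithparts" if with_parts else "gb",
        "retmode": "text",  # assumption: plain-text GenBank flat file
    }
    if email:
        # NCBI asks clients to identify themselves with an email address.
        params["email"] = email
    return f"{EFETCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_efetch_url("NC_007384"))
    print(build_efetch_url("NC_007384", with_parts=False))
```

Opening the two printed URLs in a browser should make the difference between the short and the full flat file visible before any parsing with SeqIO.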