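I am trying to retrieve and save gene summaries from the NCBI Entrez Gene database, and I want to keep the uid. Although it is present in the result, I cannot find the right way to retrieve it. See below (note: the email address used here is obviously not a valid one):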
from Bio import Entrez
Entrez.email = "bogus@bogus.com"
handle = Entrez.esummary(db="gene", id="79001")
record = Entrez.read(handle)
handle.close()
for k in record["DocumentSummarySet"]['DocumentSummary'][0].keys():
    print(k)
These are the keys:
Status,NomenclatureSymbol,OtherDesignations,Mim,Name,NomenclatureName,CurrentID,GenomicInfo,OtherAliases,Summary,GeneWeight,GeneticSource,MapLocation,ChrSort,ChrStart,LocationHist,Organism,NomenclatureStatus,Chromosome,Description
But if you look at the element itself (record["DocumentSummarySet"]['DocumentSummary'][0]), you will notice attributes={u'uid': u'79001'} at the very end:
DictElement(
{u'Status': '0',
u'NomenclatureSymbol': 'VKORC1',
u'OtherDesignations': 'phylloquinone epoxide reductase',
u'Mim': ['608547'],
u'Name': 'VKORC1',
u'NomenclatureName': 'vitamin K epoxide reductase complex subunit 1',
u'CurrentID': '0',
u'GenomicInfo': [
{u'ChrAccVer': 'NC_000016.10',
u'ChrLoc': '16',
u'ExonCount': '4',
u'ChrStop': '31090841',
u'ChrStart': '31094998'}],
u'OtherAliases': 'EDTP308, MST134, MST576, VKCFD2, VKOR',
u'Summary': 'This gene [...] variants. [provided by RefSeq, Aug 2015]',
u'GeneWeight': '46017',
u'GeneticSource': 'genomic',
u'MapLocation': '16p11.2',
u'ChrSort': '16',
u'ChrStart': '31090841',
u'LocationHist': [
{u'AssemblyAccVer': 'GCF_000001405.33',
u'ChrAccVer': 'NC_000016.10',
u'AnnotationRelease': '108',
u'ChrStop': '31090841',
u'ChrStart': '31094998'}],
u'Organism': {
u'CommonName': 'human',
u'ScientificName': 'Homo sapiens',
u'TaxID': '9606'},
u'NomenclatureStatus': 'Official',
u'Chromosome': '16',
u'Description': 'vitamin K epoxide reductase complex subunit 1'},
attributes={u'uid': u'79001'})
But 'attributes' is not one of the keys, and I have not found a way to access the uid stored in attributes. Does anyone have an idea?
Answer 0 (score: 3)
attributes is simply an attribute of the DictElement object, not a dictionary key, so you can access it with standard dot notation:
record["DocumentSummarySet"]['DocumentSummary'][0].attributes
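As a rough illustration of why the uid does not show up in keys(), here is a minimal sketch that runs without network access. FakeDictElement is a hypothetical stand-in written for this example (it is not Biopython's class); it only assumes, as the output in the question suggests, that the element behaves like a dict carrying a separate attributes dict. The helper summaries_by_uid is likewise an illustrative name, not part of Bio.Entrez.

```python
class FakeDictElement(dict):
    """Hypothetical stand-in for the element returned by Entrez.read:
    a plain dict plus a separate `attributes` dict on the object."""

    def __init__(self, data, attributes):
        super().__init__(data)
        # Stored on the object itself, so it never appears in keys().
        self.attributes = attributes


def summaries_by_uid(doc_summaries):
    """Map each summary's uid (read from .attributes) to its Summary text."""
    return {doc.attributes["uid"]: doc["Summary"] for doc in doc_summaries}


# Values copied from the output shown in the question.
doc = FakeDictElement(
    {"Name": "VKORC1",
     "Summary": "This gene [...] variants. [provided by RefSeq, Aug 2015]"},
    attributes={"uid": "79001"},
)

uid = doc.attributes["uid"]        # dot access, then a normal dict lookup
result = summaries_by_uid([doc])   # keeps the uid alongside each summary
print(uid)
print(sorted(doc.keys()))
```

With a real record, the same pattern would presumably apply to each item of record["DocumentSummarySet"]['DocumentSummary'], which lets you save every summary under its uid.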