How do I get the UniProtKB ID for a given PDB ID?

Asked: 2015-05-10 12:28:30

Tags: python bioinformatics biopython

I need to get the length and domain structure of a specific protein, for example 1btk. To do that, I first need its UniProtKB ID. How can I get it?

From the website http://www.rcsb.org/pdb/explore.do?structureId=1BTK


The UniProtKB ID is 'Q06187'.

1 answer:

Answer 0 (score: 1)

You can download the PDB file with urllib2 and then extract the UniProt ID with a regular expression:

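import re
import urllib2  # Python 2 only; in Python 3 this lives in urllib.request

url_template = "http://www.rcsb.org/pdb/files/{}.pdb"

protein = "1BTK"
url = url_template.format(protein)

# Download the PDB file as text
response = urllib2.urlopen(url)
pdb = response.read()
response.close()  # best practice to close the file

# DBREF records name the source database (UNP = UniProt), then the accession
m = re.search(r'UNP\s+(\w+)', pdb)
print(m.group(1))
# you get 'Q06187'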
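
On Python 3, where urllib2 no longer exists, the same idea can be sketched roughly as follows. This is a minimal sketch, not the answerer's code: the helper names fetch_pdb_text and uniprot_accessions are hypothetical, the URL template is simply the one quoted above (whether it still resolves is an assumption), and the pattern is anchored to DBREF lines so that a stray "UNP" elsewhere in the file is not picked up.

```python
import re
import urllib.request

# URL template copied from the answer; assumed, not verified, to still be served.
URL_TEMPLATE = "http://www.rcsb.org/pdb/files/{}.pdb"

# DBREF lines that cross-reference UniProt carry "UNP" followed by the accession.
DBREF_UNP = re.compile(r"^DBREF\s.*?\bUNP\s+(\w+)", re.MULTILINE)


def fetch_pdb_text(pdb_id, url_template=URL_TEMPLATE):
    """Download a PDB entry and return it as text (hypothetical helper)."""
    with urllib.request.urlopen(url_template.format(pdb_id)) as response:
        # urlopen returns bytes in Python 3, so decode before using a str regex.
        return response.read().decode("ascii", errors="replace")


def uniprot_accessions(pdb_text):
    """Return the distinct UniProt accessions named in DBREF records, in file order."""
    found = []
    for accession in DBREF_UNP.findall(pdb_text):
        if accession not in found:
            found.append(accession)
    return found
```

Calling uniprot_accessions(fetch_pdb_text("1BTK")) would then be expected to give a list containing 'Q06187', going by the answer above. An entry whose chains come from different proteins may yield more than one accession, which is why the sketch returns a list rather than a single match.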

Bonus, if you also want to parse the PDB file:

from Bio.PDB.PDBParser import PDBParser
response = urllib2.urlopen(url)
parser = PDBParser()
structure = parser.get_structure(protein, response)
response.close()  # best practice to close the file
header = parser.get_header()
trailer = parser.get_trailer()
# information about the protein is now in structure, header and trailer
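
Since the question also asks for the protein length, one dependency-free option is to read the SEQRES records from the same downloaded text. The sketch below is an illustration under assumptions: seqres_lengths is a hypothetical helper, and it relies on the fixed-column PDB layout in which column 12 holds the chain ID and columns 14-17 hold the residue count for that chain. The count describes the construct in the PDB entry, which is not necessarily the full-length UniProt sequence.

```python
def seqres_lengths(pdb_text):
    """Map each chain ID to the residue count declared in its SEQRES records.

    Hypothetical helper; assumes standard fixed-column PDB formatting.
    """
    lengths = {}
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES"):
            chain_id = line[11]            # column 12: chain identifier
            num_res = int(line[13:17])     # columns 14-17: residues in this chain
            lengths[chain_id] = num_res    # repeated on every SEQRES line of the chain
    return lengths
```

With the pdb string downloaded earlier, seqres_lengths(pdb) would give a dictionary with one entry per chain. For domain boundaries, the UniProt entry itself remains the place to look.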