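I would like to write a program that extracts the amino acid sequences corresponding to features of type "Region" as separate FASTA files, and lists the amino acid and position of every "Site" with site_type="phosphorylation".
It should not use the Biopython package.
(My Biopython code already does the same thing.)
The file is below.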
LOCUS       NP_005219               1210 aa            linear   PRI 15-MAR-2015
DEFINITION  epidermal growth factor receptor isoform a precursor [Homo sapiens].
ACCESSION   NP_005219
VERSION     NP_005219.2  GI:29725609
DBSOURCE    REFSEQ: accession NM_005228.3
KEYWORDS    RefSeq.
FEATURES             Location/Qualifiers
     source          1..1210
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="7"
                     /map="7p12"
     Protein         1..1210
                     /product="epidermal growth factor receptor isoform a
                     precursor"
                     /EC_number="2.7.10.1"
                     /note="avian erythroblastic leukemia viral (v-erb-b)
                     oncogene homolog; cell proliferation-inducing protein 61;
                     cell growth inhibiting protein 40; proto-oncogene
                     c-ErbB-1; receptor tyrosine-protein kinase erbB-1"
     sig_peptide     1..24
                     /inference="COORDINATES: ab initio prediction:SignalP:4.0"
                     /calculated_mol_wt=2283
     mat_peptide     25..1210
                     /product="epidermal growth factor receptor isoform a"
                     /calculated_mol_wt=132013
     Region          57..168
                     /region_name="Recep_L_domain"
                     /note="Receptor L domain; pfam01030"
                     /db_xref="CDD:250307"
     Region          75..300
                     /region_name="Approximate"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Region          185..337
                     /region_name="Furin-like"
                     /note="Furin-like cysteine rich region; pfam00757"
                     /db_xref="CDD:250112"
     Site            229
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:21487020};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Region          231..274
                     /region_name="FU"
                     /note="Furin-like repeats. Cysteine rich region. Exact
                     function of the domain is not known. Furin is a
                     serine-kinase dependent proprotein processor. Other
                     members of this family include endoproteases and cell
                     surface receptors; cd00064"
                     /db_xref="CDD:238021"
     Region          361..481
                     /region_name="Recep_L_domain"
                     /note="Receptor L domain; pfam01030"
                     /db_xref="CDD:250307"
     Region          390..600
                     /region_name="Approximate"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Region          505..637
                     /region_name="GF_recep_IV"
                     /note="Growth factor receptor domain IV; pfam14843"
                     /db_xref="CDD:258980"
     Region          506..559
                     /region_name="FU"
                     /note="Furin-like repeats. Cysteine rich region. Exact
                     function of the domain is not known. Furin is a
                     serine-kinase dependent proprotein processor. Other
                     members of this family include endoproteases and cell
                     surface receptors; cd00064"
                     /db_xref="CDD:238021"
     Region          558..>598
                     /region_name="FU"
                     /note="Furin-like repeats. Cysteine rich region. Exact
                     function of the domain is not known. Furin is a
                     serine-kinase dependent proprotein processor. Other
                     members of this family include endoproteases and cell
                     surface receptors; cd00064"
                     /db_xref="CDD:238021"
     Region          634..677
                     /region_name="TM_ErbB1"
                     /note="Transmembrane domain of Epidermal Growth Factor
                     Receptor or ErbB1, a Protein Tyrosine Kinase; cd12093"
                     /db_xref="CDD:213054"
     Site            order(644..646,648..653,656..657)
                     /site_type="other"
                     /note="heterodimer interface [polypeptide binding]"
                     /db_xref="CDD:213054"
     Site            646..668
                     /site_type="transmembrane region"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            678
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphothreonine, by PKC and PKD/PRKD1.
                     {ECO:0000269|PubMed:10523301}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Region          688..704
                     /region_name="Important for dimerization, phosphorylation
                     and activation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            693
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphothreonine, by PKD/PRKD1.
                     {ECO:0000269|PubMed:10523301,
                     ECO:0000269|PubMed:16083266,
                     ECO:0000269|PubMed:18691976,
                     ECO:0000269|PubMed:20068231,
                     ECO:0000269|PubMed:3138233}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            695
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18691976,
                     ECO:0000269|PubMed:3138233}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Region          704..1016
                     /region_name="PTKc_EGFR"
                     /note="Catalytic domain of the Protein Tyrosine Kinase,
                     Epidermal Growth Factor Receptor; cd05108"
                     /db_xref="CDD:270683"
     Region          712..968
                     /region_name="Pkinase_Tyr"
                     /note="Protein tyrosine kinase; pfam07714"
                     /db_xref="CDD:254379"
     Site            order(715..717,728..730,794..795,797,804..805,1009..1010)
                     /site_type="other"
                     /note="dimer interface [polypeptide binding]"
                     /db_xref="CDD:270683"
     Site            order(718..719,722..723,745,791,793,797,841..842,855,
                     876..880,885,889)
                     /site_type="active"
                     /db_xref="CDD:270683"
     Site            order(718..719,726,743,745,766,790..791,793,841..842,844,
                     855)
                     /site_type="other"
                     /note="ATP binding site [chemical binding]"
                     /db_xref="CDD:270683"
     Site            854..879
                     /site_type="other"
                     /note="activation loop (A-loop)"
                     /db_xref="CDD:270683"
     Site            order(876..880,885,889)
                     /site_type="other"
                     /note="polypeptide substrate binding site [polypeptide
                     binding]"
                     /db_xref="CDD:270683"
     Site            991
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:16083266,
                     ECO:0000269|PubMed:18669648,
                     ECO:0000269|PubMed:20068231}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            995
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18669648};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            998
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine, by autocatalysis.
                     {ECO:0000269|PubMed:18669648,
                     ECO:0000269|PubMed:19563760}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1016
                     /site_type="other"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Important for interaction with PIK3C2B; propagated
                     from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1016
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine, by autocatalysis.
                     {ECO:0000269|PubMed:19563760}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1026
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:16083266};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1039
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18669648};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1041
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphothreonine. {ECO:0000269|PubMed:18669648};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1042
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18669648};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1064
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18669648,
                     ECO:0000269|PubMed:18691976,
                     ECO:0000269|PubMed:20068231}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1069
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine. {ECO:0000305|PubMed:22888118};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1070
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:3138233};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1071
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:3138233};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1081
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18691976};
                     propagated from UniProtKB/Swiss-Prot (P00533.2)"
     Site            1092
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine, by autocatalysis.
                     {ECO:0000269|PubMed:12873986}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1110
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine, by autocatalysis.
                     {ECO:0000269|PubMed:12873986,
                     ECO:0000269|PubMed:2543678}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1166
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphoserine. {ECO:0000269|PubMed:18669648,
                     ECO:0000269|PubMed:18691976}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1172
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine, by autocatalysis.
                     {ECO:0000269|PubMed:17081983}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1197
                     /site_type="phosphorylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Phosphotyrosine, by autocatalysis.
                     {ECO:0000269|PubMed:17081983,
                     ECO:0000269|PubMed:18691976,
                     ECO:0000269|PubMed:19563760,
                     ECO:0000269|PubMed:19836242,
                     ECO:0000269|PubMed:20068231}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     Site            1199
                     /site_type="methylation"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Omega-N-methylarginine.
                     {ECO:0000269|PubMed:21258366}; propagated from
                     UniProtKB/Swiss-Prot (P00533.2)"
     CDS             1..1210
                     /gene="EGFR"
                     /gene_synonym="ERBB; ERBB1; HER1; mENA; NISBD2; PIG61"
                     /coded_by="NM_005228.3:247..3879"
                     /note="isoform a precursor is encoded by transcript
                     variant 1"
                     /db_xref="CCDS:CCDS5514.1"
                     /db_xref="GeneID:1956"
                     /db_xref="HGNC:HGNC:3236"
                     /db_xref="MIM:131550"
ORIGIN
        1 mrpsgtagaa llallaalcp asraleekkv cqgtsnkltq lgtfedhfls lqrmfnncev
       61 vlgnleityv qrnydlsflk tiqevagyvl ialntverip lenlqiirgn myyensyala
      121 vlsnydankt glkelpmrnl qeilhgavrf snnpalcnve siqwrdivss dflsnmsmdf
      181 qnhlgscqkc dpscpngscw gageencqkl tkiicaqqcs grcrgkspsd cchnqcaagc
      241 tgpresdclv crkfrdeatc kdtcpplmly npttyqmdvn pegkysfgat cvkkcprnyv
      301 vtdhgscvra cgadsyemee dgvrkckkce gpcrkvcngi gigefkdsls inatnikhfk
      361 nctsisgdlh ilpvafrgds fthtppldpq eldilktvke itgflliqaw penrtdlhaf
      421 enleiirgrt kqhgqfslav vslnitslgl rslkeisdgd viisgnknlc yantinwkkl
      481 fgtsgqktki isnrgensck atgqvchalc spegcwgpep rdcvscrnvs rgrecvdkcn
      541 llegeprefv enseciqchp eclpqamnit ctgrgpdnci qcahyidgph cvktcpagvm
      601 genntlvwky adaghvchlc hpnctygctg pglegcptng pkipsiatgm vgalllllvv
      661 algiglfmrr rhivrkrtlr rllqerelve pltpsgeapn qallrilket efkkikvlgs
      721 gafgtvykgl wipegekvki pvaikelrea tspkankeil deayvmasvd nphvcrllgi
      781 cltstvqlit qlmpfgclld yvrehkdnig sqyllnwcvq iakgmnyled rrlvhrdlaa
      841 rnvlvktpqh vkitdfglak llgaeekeyh aeggkvpikw malesilhri ythqsdvwsy
      901 gvtvwelmtf gskpydgipa seissilekg erlpqppict idvymimvkc wmidadsrpk
      961 freliiefsk mardpqrylv iqgdermhlp sptdsnfyra lmdeedmddv vdadeylipq
     1021 qgffsspsts rtpllsslsa tsnnstvaci drnglqscpi kedsflqrys sdptgalted
     1081 siddtflpvp eyinqsvpkr pagsvqnpvy hnqplnpaps rdphyqdphs tavgnpeyln
     1141 tvqptcvnst fdspahwaqk gshqisldnp dyqqdffpke akpngifkgs taenaeylrv
     1201 apqssefiga
//
Answer 0 (score: 0)
I suggest using Biopython:
from Bio import SeqIO

file = "file.gb"
# Python 3: SeqIO.parse returns an iterator, so take the first (only) record.
# In Python 2 this was: gb = SeqIO.parse(open(file), "gb").next()
gb = next(SeqIO.parse(file, "genbank"))

# Keep only Site features whose site_type qualifier is "phosphorylation".
phosphorylation_list = [f for f in gb.features if f.type == "Site" and
                        "phosphorylation" in f.qualifiers.get("site_type", [])]

for f in phosphorylation_list:
    print((int(f.location.start), int(f.location.end)))
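You get the (start, end) pairs below. Biopython locations are 0-based and end-exclusive, so (228, 229) corresponds to residue 229 in the GenBank file, i.e. gb.seq[228]:
(228, 229)
(677, 678)
(692, 693)
(694, 695)
(990, 991)
(994, 995)
(997, 998)
(1015, 1016)
(1025, 1026)
(1038, 1039)
(1040, 1041)
(1041, 1042)
(1063, 1064)
(1068, 1069)
(1069, 1070)
(1070, 1071)
(1080, 1081)
(1091, 1092)
(1109, 1110)
(1165, 1166)
(1171, 1172)
(1196, 1197)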
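Since the question asks for a version without Biopython, here is a minimal sketch in plain Python. It assumes the record is in the standard GenBank flat-file layout (feature keys indented 5 spaces, locations and qualifiers starting at column 22) and only handles simple locations such as `229` or `57..168`; compound locations like `order(...)` are skipped. The names `parse_record`, `simple_span`, `region_fasta_records`, `phosphorylation_sites` and the tiny `SAMPLE` record are made up for illustration and are not part of any library.

```python
import re

# Made-up toy record, only here so the sketch runs on its own.
SAMPLE = """\
LOCUS       TOY_0001                  20 aa            linear   SYN 01-JAN-2000
FEATURES             Location/Qualifiers
     Region          3..8
                     /region_name="toy_domain"
                     /note="a made-up region whose note wraps onto
                     a second line"
     Site            5
                     /site_type="phosphorylation"
     Site            order(2..3,7)
                     /site_type="other"
ORIGIN
        1 mkvlsatgyr desqwhnpct
//
"""


def parse_record(text):
    """Parse one GenBank protein record into (sequence, features)."""
    features, seq_chunks = [], []
    section, name, open_quote = None, None, False
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            section = "features"
        elif line.startswith("ORIGIN"):
            section = "origin"
        elif line.startswith("//"):
            break
        elif section == "origin":
            # Keep letters only: drops the residue counter and block spaces.
            seq_chunks.append(re.sub(r"[^A-Za-z]", "", line))
        elif section == "features" and line.strip():
            if not line.startswith(" "):
                section = None  # a new top-level section ends the table
                continue
            key, body = line[:21].strip(), line[21:].strip()
            if key:  # text in the key columns starts a new feature
                features.append({"key": key, "location": body, "qualifiers": {}})
                name, open_quote = None, False
            elif not features:
                continue
            elif body.startswith("/") and not open_quote:  # new qualifier
                name, _, value = body[1:].partition("=")
                # Repeated qualifiers (e.g. db_xref) keep only the last value.
                features[-1]["qualifiers"][name] = value
                open_quote = value.startswith('"') and not value[1:].endswith('"')
            elif name is None:  # location wrapped onto another line
                features[-1]["location"] += body
            else:  # qualifier value wrapped onto another line
                features[-1]["qualifiers"][name] += " " + body
                open_quote = not body.endswith('"')
    for feature in features:
        quals = feature["qualifiers"]
        for qname in quals:
            quals[qname] = quals[qname].strip('"')
    return "".join(seq_chunks).upper(), features


def simple_span(location):
    """Return 1-based inclusive (start, end) for 'N' or 'N..M', else None."""
    match = re.fullmatch(r"[<>]?(\d+)(?:\.\.[<>]?(\d+))?", location)
    if match is None:
        return None  # compound locations such as order(...) are skipped
    start = int(match.group(1))
    return start, int(match.group(2) or start)


def region_fasta_records(seq, features):
    """Yield (header, residues) for every Region with a simple location."""
    for feature in features:
        span = simple_span(feature["location"])
        if feature["key"] == "Region" and span:
            start, end = span
            region = feature["qualifiers"].get("region_name", "unnamed")
            yield f"{region}_{start}-{end}", seq[start - 1:end]


def phosphorylation_sites(seq, features):
    """Return (residue, 1-based position) for each phosphorylation Site."""
    sites = []
    for feature in features:
        span = simple_span(feature["location"])
        if (feature["key"] == "Site" and span and span[0] == span[1]
                and feature["qualifiers"].get("site_type") == "phosphorylation"):
            sites.append((seq[span[0] - 1], span[0]))
    return sites


sequence, feats = parse_record(SAMPLE)
for header, residues in region_fasta_records(sequence, feats):
    print(f">{header}\n{residues}")
for residue, position in phosphorylation_sites(sequence, feats):
    print(residue, position)
```

On the toy record this prints `>toy_domain_3-8`, `VLSATG` and `S 5`. For the real file, replace `SAMPLE` with the contents of `file.gb` and write each `(header, residues)` pair to its own file. The coordinates are part of the header because several regions share a name (for example `FU`), and region names containing spaces or commas would need sanitizing before being used as file names.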