KeyError: 'CA' when trying to build a list of CA atoms with the following code

Date: 2016-02-29 09:52:35

Tags: python biopython sequence-alignment

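When I run the code below I get the error listed after it. Can anyone give me some insight into how to append the CA atoms to the tag_atoms / tagged_atoms lists, which I will use for the alignment? Please also point out any potential flaws in the way the code is written that I may have overlooked. I'm new to Python, so any insight would be very helpful.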

def loadPDB(pdb_name):

    folder = pdb_name[1:3]
    pdbl = PDB.PDBList()
    pdbl.retrieve_pdb_file(pdb_name)
    parser = PDB.PDBParser(PERMISSIVE=1)
    structure = parser.get_structure(
        pdb_name, folder + "/pdb" + pdb_name + ".ent")

    return structure

def alignCoordinates(taggedProtein, potentialTag):
    for model in taggedProtein:
        firstModel = model
        break
    for chain in firstModel:
        firstChain = chain
        break

    for firstChain in firstModel:
        tagged_atoms = []
        tag_atoms    = []

        for residue in firstChain:
            tagged_res = residue

        for tagged_res in firstChain:
            tagged_atoms.append(firstChain['CA'])

    for model in potentialTag:
        firstTagModel = model
        break

    for chain in firstTagModel:
        firstTagChain = chain
        break

    for residue in firstTagChain:
        tag_res = residue

        for tag_res in firstTagChain:
            tag_atoms.append(firstTagChain['CA'])

    super_imposer = Bio.PDB.Superimposer()
    print repr(tagged_atoms)
    print repr(tag_atoms)
    super_imposer.set_atoms(tagged_atoms, tag_atoms)
    super_imposer.apply(tag_model.get_atoms())

    print super_imposer.rms

    io = Bio.PDB.PDBIO()
    io.set_structure(tag_model)
    io.save("Aligned.PDB")

def main():

    pdb1 = "2lyz"
    pdb2 = "4abn"

    potentialTag  = loadPDB(pdb1)
    taggedProtein = loadPDB(pdb2)

    alignCoordinates(taggedProtein, potentialTag)

main()

This is the error message:

Structure exists: '/Users/Azi_Ts/Desktop/ly/pdb2lyz.ent' 
Structure exists: '/Users/Azi_Ts/Desktop/ab/pdb4abn.ent' 
/Library/Python/2.7/site-packages/Bio/PDB/StructureBuilder.py:87: PDBConstructionWarning: WARNING: Chain A is discontinuous at line 13957.
  PDBConstructionWarning)
/Library/Python/2.7/site-packages/Bio/PDB/StructureBuilder.py:87: PDBConstructionWarning: WARNING: Chain B is discontinuous at line 14185.
  PDBConstructionWarning)

Traceback (most recent call last):
  File "alignPDB.py", line 76, in <module>
    main()
  File "alignPDB.py", line 74, in main
    alignCoordinates(taggedProtein, potentialTag)
  File "alignPDB.py", line 39, in alignCoordinates
    tagged_atoms.append(firstChain['CA'])
  File "/Library/Python/2.7/site-packages/Bio/PDB/Chain.py", line 70, in __getitem__
    return Entity.__getitem__(self, id)
  File "/Library/Python/2.7/site-packages/Bio/PDB/Entity.py", line 38, in __getitem__
    return self.child_dict[id]
KeyError: 'CA'

1 Answer:

Answer 0 (score: 0)

To get all the CA atoms, you only need to do this:

ca_atoms = [atom for atom in taggedProtein.get_atoms() if atom.name=="CA"]

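Keep in mind that the loaded structures, taggedProtein and potentialTag, have three methods that may be useful here: get_chains(), get_residues() and get_atoms(). Using these three, you can get rid of every for loop in alignCoordinates().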
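As for why the KeyError appears: the traceback shows the lookup going through Chain.__getitem__, and a chain is indexed by residue id, not by atom name, so firstChain['CA'] cannot work. The atom name has to be looked up on each residue instead. Below is a minimal sketch of how the whole alignment might look, written for Python 3. The function names ca_atoms_of_first_chain and align_on_ca and the out_path parameter are my own and purely illustrative, and cutting both atom lists to the same length is an assumption made only so that Superimposer.set_atoms accepts them. It is not a substitute for a real sequence alignment.

```python
def ca_atoms_of_first_chain(structure):
    """Collect the CA atom of every residue in the first chain of the first model.

    Works with a Bio.PDB Structure, or with any nested container that has the
    same model -> chain -> residue -> atom layout.
    """
    first_model = next(iter(structure))
    first_chain = next(iter(first_model))
    # Index the residue (not the chain) by atom name, and skip residues such
    # as waters that have no atom named CA.
    return [residue["CA"] for residue in first_chain if "CA" in residue]


def align_on_ca(fixed_structure, moving_structure, out_path="Aligned.pdb"):
    """Superimpose moving_structure onto fixed_structure, save it, return the RMSD."""
    # Imported here so the helper above can be used without Biopython installed.
    from Bio.PDB import PDBIO, Superimposer

    fixed = ca_atoms_of_first_chain(fixed_structure)
    moving = ca_atoms_of_first_chain(moving_structure)

    # ASSUMPTION: set_atoms needs two lists of equal length, so both are cut
    # to the shorter one. This pairs residues by position only.
    n = min(len(fixed), len(moving))

    superimposer = Superimposer()
    superimposer.set_atoms(fixed[:n], moving[:n])
    # Apply the fitted rotation and translation to every atom of the moving structure.
    superimposer.apply(list(moving_structure.get_atoms()))

    io = PDBIO()
    io.set_structure(moving_structure)
    io.save(out_path)
    return superimposer.rms
```

With the structures from the question, this would be called as align_on_ca(taggedProtein, potentialTag), which moves potentialTag onto taggedProtein and writes the result to Aligned.pdb.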