Satisfying three conditions in a file

Time: 2013-09-21 06:24:32

Tags: python

Suppose I have the following file (in PDB format):

     ATOM      1  N   MET A   1      66.104  56.583 -35.505  1.00  0.00           N 
     ATOM      2  CA  MET A   1      66.953  57.259 -36.531  1.00  0.00           C
     ATOM      3  C   MET A   1      67.370  56.262 -37.627  1.00  0.00           C
     ATOM      4  O   MET A   1      67.105  55.079 -37.531  1.00  0.00           O
     ATOM      5  CB  MET A   1      68.227  57.852 -35.867  1.00  0.00           C
     ATOM      6  CG  MET A   1      67.848  58.995 -34.899  1.00  0.00           C
     ATOM      7  SD  MET A   1      66.880  58.593 -33.421  1.00  0.00           S
     ATOM      8  CE  MET A   1      68.253  58.332 -32.269  1.00  0.00           C
     ATOM      9  H1  MET A   1      66.566  56.636 -34.576  1.00  0.00           H
     ATOM     10  H2  MET A   1      65.969  55.585 -35.765  1.00  0.00           H
     ATOM     11  H3  MET A   1      65.179  57.056 -35.460  1.00  0.00           H
     ATOM     12  HA  MET A   1      66.373  58.046 -36.989  1.00  0.00           H
     ATOM     13  HB2 MET A   1      68.743  57.078 -35.317  1.00  0.00           H
     ATOM     14  HB3 MET A   1      68.894  58.236 -36.625  1.00  0.00           H
     ATOM     15  HG2 MET A   1      68.760  59.479 -34.578  1.00  0.00           H
     ATOM     16  HG3 MET A   1      67.283  59.729 -35.455  1.00  0.00           H
     ATOM     17  HE1 MET A   1      68.880  57.524 -32.617  1.00  0.00           H
     ATOM     18  HE2 MET A   1      67.847  58.062 -31.306  1.00  0.00           H
     ATOM     19  HE3 MET A   1      68.822  59.245 -32.159  1.00  0.00           H
     ATOM     21  CA  ALA A   2      68.498  55.965 -39.793  1.00  0.00           C
     ATOM     22  C   ALA A   2      70.028  56.064 -39.893  1.00  0.00           C
     ATOM     23  O   ALA A   2      70.561  56.995 -40.466  1.00  0.00           O
     ATOM     24  CB  ALA A   2      67.833  56.491 -41.076  1.00  0.00           C
     ATOM     25  H   ALA A   2      68.194  57.752 -38.637  1.00  0.00           H
     ATOM     26  HA  ALA A   2      68.226  54.930 -39.645  1.00  0.00           H
     ATOM     27  HB1 ALA A   2      66.760  56.401 -40.994  1.00  0.00           H
     ATOM     28  HB2 ALA A   2      68.167  55.915 -41.926  1.00  0.00           H
     ATOM     29  HB3 ALA A   2      68.085  57.529 -41.233  1.00  0.00           H
     ATOM     30  N   THR A   3      70.681  55.084 -39.321  1.00  0.00           N
     ATOM     31  CA  THR A   3      72.178  55.028 -39.324  1.00  0.00           C
     ATOM     32  C   THR A   3      72.651  53.933 -40.300  1.00  0.00           C

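I want to do some processing only if N, CA, and C all appear consecutively, in that order, in the file (this condition is true for residues 1 and 3, but false for residue 2). The sixth column shows the residue number. Suppose I write the following: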

     if line[0:6] == 'ATOM  ':
         if line[12:16] in (' N  ', ' CA ', ' C  '):
             pass  # do some processing

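But this test also passes for residue 2, where N, CA, and C do not appear in sequence. How can I modify the code above so that the processing runs only when N, CA, and C appear one after another? Thanks a lot.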

1 answer:

Answer 0: (score: 2)

You can keep a queue of three elements:

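     q = [None] * 3  # atom names of the last three ATOM records seen
     for line in lines:  # lines: any iterable over the lines of the file
         if line[0:5] == "ATOM ":
             # shift the queue left and append the current atom name
             q[0] = q[1]; q[1] = q[2]; q[2] = line[12:16]
             if q == [" N  ", " CA ", " C  "]:
                 pass  # found a match: do the processing here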
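For reference, the same idea can be written as a small self-contained function. This is only a sketch under a few assumptions that are not in the question: the function name `residues_with_backbone` and the `sample` list are made up for illustration, the residue number is read from `line[22:26]` (its position in the standard PDB ATOM layout), and a match is only reported when all three atoms carry the same residue number.

```python
from collections import deque

def residues_with_backbone(lines):
    """Return residue numbers where N, CA, C are three consecutive ATOM records."""
    # A deque with maxlen=3 drops the oldest entry automatically on append.
    window = deque(maxlen=3)
    found = []
    for line in lines:
        if line[0:6] != "ATOM  ":
            continue
        # Store (atom name, residue number) for the current record.
        window.append((line[12:16], line[22:26]))
        names = [name for name, _ in window]
        residues = {res for _, res in window}
        # Assumption: all three atoms must belong to the same residue.
        if names == [" N  ", " CA ", " C  "] and len(residues) == 1:
            found.append(int(line[22:26]))
    return found

# Hypothetical sample: residue 2 has no N record, as in the question.
sample = [
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "ATOM      3  C   MET A   1",
    "ATOM      4  O   MET A   1",
    "ATOM     21  CA  ALA A   2",
    "ATOM     22  C   ALA A   2",
    "ATOM     23  O   ALA A   2",
    "ATOM     30  N   THR A   3",
    "ATOM     31  CA  THR A   3",
    "ATOM     32  C   THR A   3",
]
print(residues_with_backbone(sample))
```

Using `deque(maxlen=3)` replaces the manual shifting of `q[0]`, `q[1]`, `q[2]`; the comparison itself is unchanged.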

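If other atoms are allowed in between (but you still need N, CA, and C in that order), push an atom onto the queue only if it is N, CA, or C, and ignore all the others.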
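A minimal sketch of that filtered variant might look like this. The names `count_backbone_runs` and `interleaved` are hypothetical, and the function simply counts matches rather than doing any real processing. Note that it does not check residue numbers, so an N from one residue followed by CA and C from the next would also count.

```python
from collections import deque

BACKBONE = (" N  ", " CA ", " C  ")

def count_backbone_runs(lines):
    """Count N, CA, C runs, ignoring any other atoms that sit in between."""
    window = deque(maxlen=3)
    matches = 0
    for line in lines:
        if line[0:6] != "ATOM  ":
            continue
        name = line[12:16]
        if name not in BACKBONE:
            continue  # other atoms never enter the queue
        window.append(name)
        if tuple(window) == BACKBONE:
            matches += 1
    return matches

# Hypothetical sample: hydrogens sit between the backbone atoms of residue 1,
# and residue 2 has no N record.
interleaved = [
    "ATOM      1  N   MET A   1",
    "ATOM      9  H1  MET A   1",
    "ATOM      2  CA  MET A   1",
    "ATOM     12  HA  MET A   1",
    "ATOM      3  C   MET A   1",
    "ATOM      4  O   MET A   1",
    "ATOM     21  CA  ALA A   2",
    "ATOM     22  C   ALA A   2",
    "ATOM     23  O   ALA A   2",
]
print(count_backbone_runs(interleaved))
```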

It is also easy to extend this approach to search for several different sequences at the same time.
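One possible way to do that, sketched here with made-up names (`PATTERNS`, `find_patterns`, and the `"carbonyl"` pattern are illustrative assumptions, not part of the question), is to keep a queue as long as the longest pattern and compare its tail against each pattern:

```python
from collections import deque

# Hypothetical patterns: label -> sequence of 4-character atom-name fields.
PATTERNS = {
    "backbone": (" N  ", " CA ", " C  "),
    "carbonyl": (" C  ", " O  "),
}

def find_patterns(lines, patterns=PATTERNS):
    """Return (label, atom serial) for each place where a pattern ends."""
    longest = max(len(p) for p in patterns.values())
    window = deque(maxlen=longest)
    hits = []
    for line in lines:
        if line[0:6] != "ATOM  ":
            continue
        window.append(line[12:16])
        recent = tuple(window)
        for label, pattern in patterns.items():
            # Compare only the last len(pattern) atom names.
            if recent[-len(pattern):] == pattern:
                hits.append((label, int(line[6:11])))
    return hits

# Hypothetical sample lines.
atoms = [
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "ATOM      3  C   MET A   1",
    "ATOM      4  O   MET A   1",
]
print(find_patterns(atoms))
```

Each hit records the serial number of the atom that completes the pattern, so the caller can decide what processing to run for which pattern.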