Counting the number of residues in PDB files

Time: 2013-05-12 12:14:20

Tags: awk

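I have PDB files like the ones shown below, and I want to count the number of residues. Column 4 is the residue name and column 6 is the residue position.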

file1.pdb

ATOM      1  N   ASN A   2      18.359  26.869  52.955  1.00 39.17           N 
ATOM      2  CA  ASN A   2      19.635  26.632  53.671  1.00 38.01           C  
ATOM      5  N   LEU A   3      20.916  28.708  54.068  1.00 32.39           N 
ATOM      6  CA  LEU A   3      21.304  29.943  54.753  1.00 28.83           C
ATOM      7  C   LEU A   3      20.084  30.834  54.955  1.00 25.23           C 
ATOM     13  N   LYS A   4      19.824  31.394  56.099  1.00 23.92           N
ATOM     14  CA  LYS A   4      18.654  32.292  56.333  1.00 21.94           C
ATOM     15  C   LYS A   5      19.164  33.678  56.668  1.00 20.25           C 

file2.pdb

ATOM      1  N   ASN A   2      18.359  26.869  52.955  1.00 39.17           N 
ATOM      2  CA  ASN A   2      19.635  26.632  53.671  1.00 38.01           C
ATOM      5  N   LEU A   3      20.916  28.708  54.068  1.00 32.39           N 
ATOM      6  CA  LEU A   3      21.304  29.943  54.753  1.00 28.83           C
ATOM      7  C   LEU A   3      20.084  30.834  54.955  1.00 25.23           C
ATOM     13  N   LYS A   4      19.824  31.394  56.099  1.00 23.92           N
ATOM     14  CA  LYS A   4      18.654  32.292  56.333  1.00 21.94           C
ATOM     15  C   LYS A   5      19.164  33.678  56.668  1.00 20.25           C

Desired output

Total no:of ASN - 2
Total no:of LEU - 2
Total no:of LYS - 4

Total no:of residues - 8

1 answer:

Answer 0 (score: 2)

$ awk '{ a[$4 $6 FILENAME]++ }   # one key per (residue name, position, file)
   END {
     # the first 3 characters of each key are the residue name
     for (i in a) { b[substr(i,1,3)]++ }
     for (i in b)
     {
       total+=b[i]
       printf "Total no:of %s - %d\n", i, b[i]
     }
     printf "\nTotal no:of residues - %d\n", total
   }' file1.pdb file2.pdb
Total no:of LEU - 2
Total no:of ASN - 2
Total no:of LYS - 4

Total no:of residues - 8
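
The per-residue lines above come out in a different order from the desired output because awk does not define the iteration order of `for (i in a)`. Below is a minimal sketch of a variant that sorts them. It makes three assumptions that are not in the original answer: only `ATOM` records should be counted, the key parts are joined with `SUBSEP` rather than plain concatenation, and the logic is wrapped in a hypothetical shell function named `count_residues`.

```shell
# count_residues is a hypothetical helper name, not part of the original answer.
# Usage: count_residues file1.pdb file2.pdb
count_residues() {
  awk '
    # Assumption: only ATOM records count; one key per (name, position, file).
    $1 == "ATOM" { seen[$4 SUBSEP $6 SUBSEP FILENAME] = 1 }
    END {
      for (key in seen) {
        split(key, part, SUBSEP)   # part[1] is the residue name
        count[part[1]]++
      }
      for (name in count) {
        total += count[name]
        # Pipe through sort so the order does not depend on awk internals.
        printf "Total no:of %s - %d\n", name, count[name] | "sort"
      }
      close("sort")                # emit the sorted lines before the total
      printf "\nTotal no:of residues - %d\n", total
    }' "$@"
}
```

Like the answer above, this relies on whitespace-separated fields, so it assumes every line has a chain ID in column 5. If the chain ID were blank, the fields would shift; in that case the fixed character columns of the PDB format (residue name in columns 18-20, residue number in columns 23-26) could probably be read with `substr($0, 18, 3)` and `substr($0, 23, 4)` instead of `$4` and `$6`.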