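I have PDB files like the ones shown below. I want to count the number of residues. Column 4 is the residue name and column 6 is the residue position.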
file1.pdb
ATOM 1 N ASN A 2 18.359 26.869 52.955 1.00 39.17 N
ATOM 2 CA ASN A 2 19.635 26.632 53.671 1.00 38.01 C
ATOM 5 N LEU A 3 20.916 28.708 54.068 1.00 32.39 N
ATOM 6 CA LEU A 3 21.304 29.943 54.753 1.00 28.83 C
ATOM 7 C LEU A 3 20.084 30.834 54.955 1.00 25.23 C
ATOM 13 N LYS A 4 19.824 31.394 56.099 1.00 23.92 N
ATOM 14 CA LYS A 4 18.654 32.292 56.333 1.00 21.94 C
ATOM 15 C LYS A 5 19.164 33.678 56.668 1.00 20.25 C
file2.pdb
ATOM 1 N ASN A 2 18.359 26.869 52.955 1.00 39.17 N
ATOM 2 CA ASN A 2 19.635 26.632 53.671 1.00 38.01 C
ATOM 5 N LEU A 3 20.916 28.708 54.068 1.00 32.39 N
ATOM 6 CA LEU A 3 21.304 29.943 54.753 1.00 28.83 C
ATOM 7 C LEU A 3 20.084 30.834 54.955 1.00 25.23 C
ATOM 13 N LYS A 4 19.824 31.394 56.099 1.00 23.92 N
ATOM 14 CA LYS A 4 18.654 32.292 56.333 1.00 21.94 C
ATOM 15 C LYS A 5 19.164 33.678 56.668 1.00 20.25 C
Expected output
Total no:of ASN - 2
Total no:of LEU - 2
Total no:of LYS - 4
Total no:of residues - 8
Answer 0 (score: 2)
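$ awk '{ a[$4 $6 FILENAME]++ }   # one key per residue name + position + file
END {
    # Each key is one distinct residue; its first 3 characters are the residue name.
    for (i in a) { b[substr(i,1,3)]++ }
    for (i in b)                  # iteration order is unspecified in awk
    {
        total+=b[i]
        printf "Total no:of %s - %d\n", i, b[i]
    }
    printf "\nTotal no:of residues - %d\n", total
}' file1.pdb file2.pdb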
Total no:of LEU - 2
Total no:of ASN - 2
Total no:of LYS - 4

Total no:of residues - 8
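The one-liner above relies on every residue name being exactly 3 characters long and prints the per-residue lines in whatever order awk happens to iterate. A possible variant, offered only as a sketch, counts each residue the first time it is seen and pipes the per-residue lines through sort so the order is stable. The function name count_residues and the $1 == "ATOM" filter are my own assumptions, not part of the original answer, and like the original this assumes whitespace-separated columns as in the sample. Real PDB files are fixed-width, so fields can run together and shift the column numbers.

```shell
# Hypothetical helper (name is an assumption): count distinct residues
# per residue name across one or more whitespace-separated PDB-like files.
count_residues() {
    awk '
    $1 == "ATOM" {                  # assumption: only ATOM records matter
        # One key per (file, residue name, position); SUBSEP keeps fields apart.
        key = FILENAME SUBSEP $4 SUBSEP $6
        if (!(key in seen)) {
            seen[key] = 1
            count[$4]++             # first sighting of this residue
            total++
        }
    }
    END {
        # Send per-residue lines through sort for a stable order.
        for (name in count)
            printf "Total no:of %s - %d\n", name, count[name] | "sort"
        close("sort")               # wait for sort before printing the total
        printf "Total no:of residues - %d\n", total
    }' "$@"
}
```

Called as count_residues file1.pdb file2.pdb on the two sample files, this should print the ASN, LEU and LYS lines in alphabetical order followed by the total of 8, without the blank line that the original printf adds.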