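I have some text files like the ones shown below. I need to count the total number of distinct residues in each file. Column 6 holds the residue number.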
FILE1.TXT
ATOM 19 CA LYS C 323 2.648 17.703 45.442 1.00 17.46 C
ATOM 20 C LYS C 323 1.884 18.118 46.688 1.00 17.13 C
ATOM 21 O LYS C 323 0.822 17.576 46.996 1.00 17.54 O
ATOM 28 CA ARG C 324 1.835 19.574 48.632 1.00 16.33 C
ATOM 29 C ARG C 324 1.990 21.084 48.733 1.00 16.43 C
ATOM 45 N LYS C 326 2.321 24.344 50.724 1.00 16.55 N
ATOM 46 CA LYS C 326 2.843 24.570 52.063 1.00 15.26 C
ATOM 62 N ASP C 328 1.791 25.643 56.502 1.00 22.19 N
ATOM 63 CA ASP C 328 2.336 25.657 57.860 1.00 23.53 C
FILE2.TXT
ATOM 12 CG GLN B 670 52.075 84.009 47.855 1.00 97.39 C
ATOM 13 CD GLN B 670 51.068 83.904 46.726 1.00 98.36 C
ATOM 14 OE1 GLN B 670 51.239 84.504 45.665 1.00100.00 O
ATOM 16 N SER B 671 49.664 86.399 49.090 1.00 88.49 N
ATOM 17 CA SER B 671 48.384 87.100 49.166 1.00 79.72 C
Expected output
Total no:of residues in file1.txt : 4
Total no:of residues in file2.txt : 2
Answer 0 (score: 2)
Use this awk one-liner:
awk '{a[$6]} END{print "Total no:of residues in", FILENAME, ":", length(a)}' file
An alternative for awk implementations other than GNU awk:
awk '{a[$6]} END{for (i in a) s++; print "Total no:of residues in", FILENAME, ":", s}' file
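Both commands above take a single file. Given several files in one invocation, they would print one combined count labelled with the last file name. A minimal sketch of running the portable variant once per file follows. The function name count_residues and the NF >= 6 guard against blank or short lines are my assumptions, not part of the answer.

```shell
# count_residues is a hypothetical helper name, not part of the original answer.
# It runs the portable one-liner once per file so that each file gets its own count.
count_residues() {
    for f in "$@"; do
        awk 'NF >= 6 { a[$6] }      # one array key per distinct residue number (column 6)
             END {
                 n = 0
                 for (i in a) n++   # count keys portably, without relying on length(array)
                 print "Total no:of residues in", FILENAME, ":", n
             }' "$f"
    done
}

# Usage: count_residues file1.txt file2.txt
```

With the two sample files from the question saved as file1.txt and file2.txt, this should report 4 and 2, in the order the files are given.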
Answer 1 (score: 0)
Try this (untested):
awk '
    !seen[FILENAME,$6]++ { numRes[FILENAME]++ }
    END {
        for (fileName in numRes) {
            printf "Total no:of residues in %s : %d\n", fileName, numRes[fileName]
        }
    }
' file1.txt file2.txt
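One caveat with the script above: awk does not specify the order in which a for (x in array) loop visits keys, so the files may be reported in any order. A possible variant that prints them in command-line order is sketched here. The wrapper name count_residues_ordered, the order and nFiles variables, and the NF >= 6 guard are my additions and are assumptions, not part of the answer.

```shell
# count_residues_ordered is a hypothetical helper name, not part of the original answer.
# It makes a single awk pass over all files and reports them in the order they were read.
count_residues_ordered() {
    awk '
        FNR == 1 { order[++nFiles] = FILENAME }                  # first line of each file: record its name
        NF >= 6 && !seen[FILENAME, $6]++ { numRes[FILENAME]++ }  # count each residue number once per file
        END {
            for (k = 1; k <= nFiles; k++) {
                printf "Total no:of residues in %s : %d\n", order[k], numRes[order[k]]
            }
        }
    ' "$@"
}

# Usage: count_residues_ordered file1.txt file2.txt
```

An empty file never triggers FNR == 1, so it is not listed at all in this sketch.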