Given a .txt file (a DNA sequence alignment report) in the following format:
5463784 reads; of these:
5463784 (100.00%) were paired; of these:
841569 (15.40%) aligned concordantly 0 times
4469608 (81.80%) aligned concordantly exactly 1 time
152607 (2.79%) aligned concordantly >1 times
----
841569 pairs aligned 0 times concordantly or discordantly; of these:
1683138 mates make up the pairs; of these:
1407028 (83.60%) aligned 0 times
226521 (13.46%) aligned exactly 1 time
49589 (2.95%) aligned >1 times
87.12% overall alignment rate
What is the simplest way to extract part of specific lines? For example, to grab the lines containing "exactly", I can use:
awk '/exactly/{print}'
which returns:
4469608 (81.80%) aligned concordantly exactly 1 time
226521 (13.46%) aligned exactly 1 time
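But I'm not sure how to break up the returned lines to get 4469608 and 226521 into an array (and ultimately add them together) so that I can set a variable to 4696129.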
Answer 0 (score: 1)
awk '/exactly/ {sum=sum+$1;}END{print sum}' dna
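This acts only on lines that contain "exactly": for each one, the value of the first column ($1) is added to an awk variable named sum, and the END block prints the total once all lines have been read. Here dna is the name of the report file.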
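To get the result into a shell variable, as the question asks, command substitution is probably the simplest route. The following is only a sketch, assuming a POSIX shell; the temporary file and the variable names report, total, counts and loop_total are made up for illustration and are not part of the original answer.

```shell
# Hypothetical temp file holding the report from the question.
report=$(mktemp)
cat > "$report" <<'EOF'
5463784 reads; of these:
5463784 (100.00%) were paired; of these:
841569 (15.40%) aligned concordantly 0 times
4469608 (81.80%) aligned concordantly exactly 1 time
152607 (2.79%) aligned concordantly >1 times
----
841569 pairs aligned 0 times concordantly or discordantly; of these:
1683138 mates make up the pairs; of these:
1407028 (83.60%) aligned 0 times
226521 (13.46%) aligned exactly 1 time
49589 (2.95%) aligned >1 times
87.12% overall alignment rate
EOF

# One step: let awk do the sum and capture its output in a variable.
# "sum + 0" makes awk print 0 instead of an empty line when nothing matches.
total=$(awk '/exactly/ { sum += $1 } END { print sum + 0 }' "$report")

# Alternative: pull out the individual counts first, then add them in the shell.
counts=$(awk '/exactly/ { print $1 }' "$report")
loop_total=0
for n in $counts; do
    loop_total=$((loop_total + n))
done

echo "total=$total"
echo "loop_total=$loop_total"

rm -f "$report"
```

The one-step form is usually enough when only the total is needed. The second form is useful if the individual counts matter too, since each value is available as n inside the loop.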