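I am trying to count the symbol ( - ) in $5 (Ref) and use awk to output the renamed symbol and its count. The input file is tab-delimited, and the awk below is close, but the output is incorrect and I am not sure how to fix it. Thank you :).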
AWK
awk -F'\t' 'BEGIN {printf "Category\tCount\n" } $5 ~ /-/ {printf "indel" } {a[$5]++} END { for (i in a) {printf "%s\t\t%s\n",i , a[i] }}' input
Input
Index Mutation Call Start End Ref Alt Func.refGene Gene.refGene ExonicFunc.refGene Sanger
13 c.[1035-3T>C]+[1035-3T>C] 166170127 166170127 T C intronic SCN2A
16 c.[2994C>T]+[=] 166210776 166210776 C T exonic SCN2A synonymous SNV
19 c.[4914T>A]+[4914T>A] 166245230 166245230 T A exonic SCN2A synonymous SNV
20 c.[5109C>T]+[=] 166245425 166245425 C T exonic SCN2A synonymous SNV
21 c.[5139C>T]+[=] 166848646 166848646 G A exonic SCN1A synonymous SNV
22 c.3152_3153insAACCACT 166892841 166892841 - AGTGGTT exonic SCN1A frameshift insertion TP
23 c.2044-5delT 166898947 166898947 A - intronic SCN1A
25 c.1530_1531insA 166901684 166901684 - T exonic SCN1A frameshift insertion FP
Current output
Category Count
indelindelindelindel 5
A 4
C 7
Ref 1
- 4
G 2
T 6
TCCT 1
Desired output
Category Count
indel 2
Answer 0 (score: 2)
Here you go...
$ awk -F'\t' '$5=="-"{count++}
END{print "Category","Count";
print "indel",count}' file |
column -t
Category Count
indel 2
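
As for why the awk in the question misbehaves: `printf "indel"` runs once for every matching row and never prints a newline, so the word piles up on one line, and `a[$5]++` has no condition, so it counts every value in column 5, including the header. The following is a minimal sketch of a tab-separated variant that does not rely on `column`. The function name `count_indels` and the file `sample.tsv` are hypothetical, introduced only for illustration, and the three sample rows are copied from the question's input on the assumption that "Mutation Call" is a single column.

```shell
# count_indels: hypothetical helper (not from the original answer).
# Counts rows whose 5th tab-separated field is exactly "-".
count_indels() {
    awk -F'\t' -v OFS='\t' '
        $5 == "-" { count++ }          # exact match, not a regex match
        END {
            print "Category", "Count"
            print "indel", count + 0   # + 0 prints 0 instead of an empty field when nothing matched
        }' "$1"
}

# Illustrative sample file: rows 22, 23 and 25 from the question, rebuilt with real tabs.
printf '22\tc.3152_3153insAACCACT\t166892841\t166892841\t-\tAGTGGTT\n'  > sample.tsv
printf '23\tc.2044-5delT\t166898947\t166898947\tA\t-\n'                >> sample.tsv
printf '25\tc.1530_1531insA\t166901684\t166901684\t-\tT\n'             >> sample.tsv

count_indels sample.tsv
```

Setting `OFS` keeps the report tab-delimited like the input, which may be easier to feed into later steps than the space-aligned output of `column -t`.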