I have a file, created with awk, in the following format:
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 230 bases with an average of 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 119 bases with an average of 105.6
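What I want to do is group together all lines whose $2 matches (the name before the -), and then strip the - and the number after it, so that CPS1-1205 becomes CPS1. Every line in the real file has a match, but the matches are not all shown in the example. Thank you :).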
Desired output
chr2:211471445-211471675 CPS1|gc=48.3 230 bases with an average of 264.7
chr2:211460199-211460318 CPS1|gc=41.2 119 bases with an average of 105.6
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
I tried:
awk '{k=$1 FS $2; a[k]+=split[$2] "-"; c[k]++}
END{for(k in a)
{split(k,ks,FS);
print ks[1],c[k],ks[2],a[k]/c[k]}}' file > output.txt
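For reference, here is a minimal sketch of one way the grouping could be done. It is not a confirmed solution and rests on a few assumptions of mine: the grouping key is the part of $2 before a trailing -digits that is immediately followed by |; groups are printed in the order their key first appears; and, to mirror the desired output above, lines whose key occurs only once are printed last with $2 left unchanged. The array names (count, order, line) are just illustrative.

```shell
# Recreate the sample input from the question.
cat > file <<'EOF'
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 230 bases with an average of 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 119 bases with an average of 105.6
EOF

awk '
{
    # Key = $2 with "-digits|..." removed, e.g. "CPS1" from "CPS1-1205|gc=48.3".
    key = $2
    sub(/-[0-9]+[|].*/, "", key)

    # Remember the order in which keys are first seen.
    if (!(key in count)) order[++nkeys] = key

    # Store every full input line under its key.
    count[key]++
    line[key, count[key]] = $0
}
END {
    # First: keys seen more than once, grouped together, with the "-digits" stripped.
    for (i = 1; i <= nkeys; i++) {
        key = order[i]
        if (count[key] < 2) continue
        for (j = 1; j <= count[key]; j++) {
            out = line[key, j]
            # The "-digits" inside $1 is followed by a space, not "|", so only $2 is hit.
            sub(/-[0-9]+[|]/, "|", out)
            print out
        }
    }
    # Then: keys seen only once, printed unchanged (assumption based on the sample).
    for (i = 1; i <= nkeys; i++) {
        key = order[i]
        if (count[key] == 1) print line[key, 1]
    }
}' file > output.txt

cat output.txt
```

If every name really does have a match in the full file, the second loop prints nothing and can be dropped. The whole file is held in memory until END, which should be fine for inputs of moderate size.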