Grouping output by field with awk

Time: 2015-12-19 18:00:23

Tags: awk

I have a file, created with awk, in the following format:

file

chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 230 bases with an average of 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 119 bases with an average of 105.6

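What I am trying to do is group together all lines whose $2 matches (the name before the -) and then strip off the - and the number after it. Every line in the real file has at least one match, although the example does not show that. Thank you :).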

Desired output

chr2:211471445-211471675 CPS1|gc=48.3 230 bases with an average of 264.7 
chr2:211460199-211460318 CPS1|gc=41.2 119 bases with an average of 105.6
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8

I tried:

AWK

awk '{k=$1 FS $2; a[k]+=split[$2] "-"; c[k]++}
END{for(k in a)
      {split(k,ks,FS);
       print ks[1],c[k],ks[2],a[k]/c[k]}}' file > output.txt

0 answers:

No answers