Concatenate two columns and append the merged column

Time: 2016-08-08 11:09:17

Tags: shell unix awk

I have a tab-delimited file that looks like this:

    loci1   loci2   name1   name2
    utr3p   utr3p   TERF1   ISCA2
    utr3p   intron  LPP PAAF1
    utr3p   intron  RPL37A  RCC1
    coding  intron  BAG2    RP11
    intron  intron  KIF1B   SNORA21
    intron  downstream  GUSBP4  CTD
    intron  intron  CLTC    VMP1
    utr3p   utr3p   PCYT1A  ZHX3

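I want to concatenate the two columns name1 and name2, joined by "__". The merged column should be appended as a new column named "merged_names", and the result written to a new file. How can I do this with awk?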

Expected output:

loci1   loci2   name1   name2   merged_names
utr3p   utr3p   TERF1   ISCA2   TERF1__ISCA2
utr3p   intron  LPP PAAF1   LPP__PAAF1
utr3p   intron  RPL37A  RCC1    RPL37A__RCC1
coding  intron  BAG2    RP11    BAG2__RP11
intron  intron  KIF1B   SNORA21 KIF1B__SNORA21
intron  downstream  GUSBP4  CTD GUSBP4__CTD
intron  intron  CLTC    VMP1    CLTC__VMP1
utr3p   utr3p   PCYT1A  ZHX3    PCYT1A__ZHX3

2 answers:

Answer 0 (score: 3)

awk 'BEGIN{FS=OFS="\t"} NR==1{print $0, "merged_names"; next} {print $0, $3 "__" $4}' infile
loci1   loci2   name1   name2   merged_names
utr3p   utr3p   TERF1   ISCA2   TERF1__ISCA2
utr3p   intron  LPP     PAAF1   LPP__PAAF1
utr3p   intron  RPL37A  RCC1    RPL37A__RCC1
coding  intron  BAG2    RP11    BAG2__RP11
intron  intron  KIF1B   SNORA21 KIF1B__SNORA21
intron  downstream      GUSBP4  CTD     GUSBP4__CTD
intron  intron  CLTC    VMP1    CLTC__VMP1
utr3p   utr3p   PCYT1A  ZHX3    PCYT1A__ZHX3
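
If the positions of the two columns are not guaranteed, they can be looked up by header name instead of hard-coding $3 and $4. This is only a sketch, assuming the input is tab-separated and that both header names are present. The file names `infile.tsv` and `merged.tsv`, the variables `a` and `b`, and the array `col` are made up for illustration; the sample rows are copied from the question.

```shell
# Build a small tab-separated sample (rows taken from the question).
printf 'loci1\tloci2\tname1\tname2\n'  > infile.tsv
printf 'utr3p\tutr3p\tTERF1\tISCA2\n' >> infile.tsv
printf 'utr3p\tintron\tLPP\tPAAF1\n'  >> infile.tsv

# a and b hold the header names of the two columns to merge (assumed to exist).
awk -v a=name1 -v b=name2 'BEGIN { FS = OFS = "\t" }
NR == 1 {
    for (i = 1; i <= NF; i++) col[$i] = i   # map header name -> column index
    print $0, "merged_names"                # header row gets the new column name
    next
}
{ print $0, $(col[a]) "__" $(col[b]) }      # data rows get name1__name2
' infile.tsv > merged.tsv
```

The redirection `> merged.tsv` writes the result to a new file, as the question asks, and leaves the input untouched.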

Answer 1 (score: 2)

You can use this shorter awk:

awk 'BEGIN{OFS=FS="\t"} NR==1{$(NF+1)="merged_names"} NR!=1{$(NF+1)=$(NF-1) "__" $NF}1' file
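
For readers unfamiliar with the NF idiom, the same program can be spread over several lines with comments. It is shown here as a sketch against a two-row sample taken from the question; the file names `sample.tsv` and `sample_merged.tsv` are hypothetical. Because it addresses fields relative to NF, it merges the last two columns regardless of how many columns come before them.

```shell
# Small tab-separated sample (rows taken from the question).
printf 'loci1\tloci2\tname1\tname2\n'       > sample.tsv
printf 'coding\tintron\tBAG2\tRP11\n'      >> sample.tsv
printf 'intron\tdownstream\tGUSBP4\tCTD\n' >> sample.tsv

awk 'BEGIN { OFS = FS = "\t" }              # read and write tab-separated fields
NR == 1 { $(NF + 1) = "merged_names" }      # header: add a field after the last one
NR != 1 { $(NF + 1) = $(NF - 1) "__" $NF }  # data: join the last two fields
1                                           # a bare true pattern prints the record
' sample.tsv > sample_merged.tsv
```

Assigning to `$(NF + 1)` creates a new last field and makes awk rebuild the record with OFS, which is why OFS has to be set to a tab as well.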