I have a tab-delimited file that looks like this:
loci1 loci2 name1 name2
utr3p utr3p TERF1 ISCA2
utr3p intron LPP PAAF1
utr3p intron RPL37A RCC1
coding intron BAG2 RP11
intron intron KIF1B SNORA21
intron downstream GUSBP4 CTD
intron intron CLTC VMP1
utr3p utr3p PCYT1A ZHX3
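I want to join the two columns name1 and name2 with "__" as the separator. The merged column should be written to a new file as a new column named "merged_names". How can I do this with awk?
Expected output: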
loci1 loci2 name1 name2 merged_names
utr3p utr3p TERF1 ISCA2 TERF1__ISCA2
utr3p intron LPP PAAF1 LPP__PAAF1
utr3p intron RPL37A RCC1 RPL37A__RCC1
coding intron BAG2 RP11 BAG2__RP11
intron intron KIF1B SNORA21 KIF1B__SNORA21
intron downstream GUSBP4 CTD GUSBP4__CTD
intron intron CLTC VMP1 CLTC__VMP1
utr3p utr3p PCYT1A ZHX3 PCYT1A__ZHX3
Answer 0 (score: 3)
awk 'BEGIN{FS=OFS="\t"} NR==1{print $0, "merged_names"; next} {print $0, $3 "__" $4}' infile
loci1 loci2 name1 name2 merged_names
utr3p utr3p TERF1 ISCA2 TERF1__ISCA2
utr3p intron LPP PAAF1 LPP__PAAF1
utr3p intron RPL37A RCC1 RPL37A__RCC1
coding intron BAG2 RP11 BAG2__RP11
intron intron KIF1B SNORA21 KIF1B__SNORA21
intron downstream GUSBP4 CTD GUSBP4__CTD
intron intron CLTC VMP1 CLTC__VMP1
utr3p utr3p PCYT1A ZHX3 PCYT1A__ZHX3
Answer 1 (score: 2)
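You can use this shorter awk:
awk 'BEGIN{OFS=FS="\t"} NR==1{$(NF+1)="merged_names"} NR!=1{$(NF+1)=$(NF-1) "__" $NF}1' file
It appends one field to every record: the text "merged_names" on the header line, and the last two fields joined by "__" on every other line. The trailing 1 prints each modified record.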
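Neither answer shows the step of writing the result to a new file, which the question asks for. A minimal sketch of that, assuming a POSIX shell; the file names sample.tsv and merged.tsv are placeholders, and the sample holds only the first few rows of the question's data:

```shell
# Build a small tab-delimited sample (header plus two rows from the question).
# sample.tsv and merged.tsv are hypothetical names chosen for this sketch.
printf 'loci1\tloci2\tname1\tname2\n'  > sample.tsv
printf 'utr3p\tutr3p\tTERF1\tISCA2\n' >> sample.tsv
printf 'utr3p\tintron\tLPP\tPAAF1\n'  >> sample.tsv

# Read and write tabs, append the merged column, and redirect the
# result to a new file instead of printing it to the terminal.
awk 'BEGIN{FS=OFS="\t"}
     NR==1{$(NF+1)="merged_names"}
     NR>1{$(NF+1)=$(NF-1) "__" $NF}
     1' sample.tsv > merged.tsv

# Show the new file.
cat merged.tsv
```

Because the merged column is built from $(NF-1) and $NF, this version works only when name1 and name2 are the last two columns; if they sit elsewhere, referring to them by position (for example $3 and $4) is the safer choice.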