How to append the text in the first column to all other columns (Linux)

Time: 2017-05-18 08:18:29

Tags: linux bash awk

I have a large tab-delimited file that looks like this:

rhaB:   IENJKMAH_01395  MACAJNEK_00455  OLCKBDOH_04002  PMOGBMCF_03363  ANGDFNGL_03589
exuT_1: OLCKBDOH_00247  EHNKCCHC_00463  MACAJNEK_00987  PMOGBMCF_00492  LPCGNNBB_01394
recA:   OLCKBDOH_01231  MOEFEGAP_03152  JFGDENGL_01411  DNGGHEME_03701  KFALDAGO_00482
lldP:   OLCKBDOH_02876  EHNKCCHC_01431  HHOCJGFI_02180  MACAJNEK_01950  KDLNNIOI_00263

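I want to append the text from the first column to the end of the contents of every other column, so that the output looks like this: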

rhaB:   IENJKMAH_01395_rhaB MACAJNEK_00455_rhaB OLCKBDOH_04002_rhaB PMOGBMCF_03363_rhaB ANGDFNGL_03589_rhaB

The reason is that I eventually have to delete the first column, and I want to be able to trace these IDs back to it.

1 answer:

Answer 0 (score: 1)

An awk approach:

awk '{suffix=substr($1,1,length($1)-1); for(i=2;i<=NF;i++) $i=$i"_"suffix}1' file

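Output (fields are separated by single spaces, because awk rebuilds a modified record using its default output separator):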

rhaB: IENJKMAH_01395_rhaB MACAJNEK_00455_rhaB OLCKBDOH_04002_rhaB PMOGBMCF_03363_rhaB ANGDFNGL_03589_rhaB
exuT_1: OLCKBDOH_00247_exuT_1 EHNKCCHC_00463_exuT_1 MACAJNEK_00987_exuT_1 PMOGBMCF_00492_exuT_1 LPCGNNBB_01394_exuT_1
recA: OLCKBDOH_01231_recA MOEFEGAP_03152_recA JFGDENGL_01411_recA DNGGHEME_03701_recA KFALDAGO_00482_recA
lldP: OLCKBDOH_02876_lldP EHNKCCHC_01431_lldP HHOCJGFI_02180_lldP MACAJNEK_01950_lldP KDLNNIOI_00263_lldP

suffix=substr($1,1,length($1)-1) - takes the value of the 1st column without the trailing ":"

for(i=2;i<=NF;i++) $i=$i"_"suffix - appends the suffix to each of the following columns
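
Since the question describes the input as tab-delimited, a variant that keeps the tabs may be useful. The following is only a sketch: it assumes the first column is separated from the rest by a real tab character, and the sample rows and the temporary file are made up for illustration.

```shell
# Build a small tab-delimited sample in a temporary file (illustrative data only).
tmp=$(mktemp)
printf 'rhaB:\tIENJKMAH_01395\tMACAJNEK_00455\nexuT_1:\tOLCKBDOH_00247\tEHNKCCHC_00463\n' > "$tmp"

# -F'\t' splits on tabs only; OFS='\t' keeps tabs when awk rebuilds the record.
result=$(awk -F'\t' -v OFS='\t' '{
    suffix = $1
    sub(/:$/, "", suffix)          # strip the trailing colon from the key
    for (i = 2; i <= NF; i++)      # append "_<key>" to every other field
        $i = $i "_" suffix
} 1' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Using sub(/:$/, "", suffix) instead of substr only removes a colon when one is actually present, so a key without a trailing ":" would not lose its last character.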

To get "prettified" column output, you can pipe the result through column -tx:

awk '{suffix=substr($1,1,length($1)-1); for(i=2;i<=NF;i++) $i=$i"_"suffix}1' file | column -tx

Output:

rhaB:    IENJKMAH_01395_rhaB    MACAJNEK_00455_rhaB    OLCKBDOH_04002_rhaB    PMOGBMCF_03363_rhaB    ANGDFNGL_03589_rhaB
exuT_1:  OLCKBDOH_00247_exuT_1  EHNKCCHC_00463_exuT_1  MACAJNEK_00987_exuT_1  PMOGBMCF_00492_exuT_1  LPCGNNBB_01394_exuT_1
recA:    OLCKBDOH_01231_recA    MOEFEGAP_03152_recA    JFGDENGL_01411_recA    DNGGHEME_03701_recA    KFALDAGO_00482_recA
lldP:    OLCKBDOH_02876_lldP    EHNKCCHC_01431_lldP    HHOCJGFI_02180_lldP    MACAJNEK_01950_lldP    KDLNNIOI_00263_lldP
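
The question mentions that the first column will eventually be deleted. One possible way to tag the IDs and drop the key column in a single pass is sketched below; it assumes tab-separated input, and the single sample row is fed through printf purely for illustration.

```shell
# Tag every ID with its key, then print only the tagged IDs (key column dropped).
tagged=$(printf 'recA:\tOLCKBDOH_01231\tMOEFEGAP_03152\n' |
awk -F'\t' -v OFS='\t' '{
    suffix = $1
    sub(/:$/, "", suffix)                          # key without the trailing colon
    out = ""
    for (i = 2; i <= NF; i++)                      # join tagged fields with OFS
        out = out (i > 2 ? OFS : "") $i "_" suffix
    print out
}')

printf '%s\n' "$tagged"
```

Building the output string by hand avoids the leading separator that would remain if $1 were simply set to an empty string.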