I have a large tab-delimited file that looks like this:
rhaB: IENJKMAH_01395 MACAJNEK_00455 OLCKBDOH_04002 PMOGBMCF_03363 ANGDFNGL_03589
exuT_1: OLCKBDOH_00247 EHNKCCHC_00463 MACAJNEK_00987 PMOGBMCF_00492 LPCGNNBB_01394
recA: OLCKBDOH_01231 MOEFEGAP_03152 JFGDENGL_01411 DNGGHEME_03701 KFALDAGO_00482
lldP: OLCKBDOH_02876 EHNKCCHC_01431 HHOCJGFI_02180 MACAJNEK_01950 KDLNNIOI_00263
I want to append the text from the first column to the end of the content in every other column, so that the output looks like:
rhaB: IENJKMAH_01395_rhaB MACAJNEK_00455_rhaB OLCKBDOH_04002_rhaB PMOGBMCF_03363_rhaB ANGDFNGL_03589_rhaB
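The reason is that I will eventually have to delete the first column, and I want to be able to trace these IDs back.
Answer 0 (score: 1)
An awk approach: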
awk '{suffix=substr($1,1,length($1)-1); for(i=2;i<=NF;i++) $i=$i"_"suffix}1' file
Output:
rhaB: IENJKMAH_01395_rhaB MACAJNEK_00455_rhaB OLCKBDOH_04002_rhaB PMOGBMCF_03363_rhaB ANGDFNGL_03589_rhaB
exuT_1: OLCKBDOH_00247_exuT_1 EHNKCCHC_00463_exuT_1 MACAJNEK_00987_exuT_1 PMOGBMCF_00492_exuT_1 LPCGNNBB_01394_exuT_1
recA: OLCKBDOH_01231_recA MOEFEGAP_03152_recA JFGDENGL_01411_recA DNGGHEME_03701_recA KFALDAGO_00482_recA
lldP: OLCKBDOH_02876_lldP EHNKCCHC_01431_lldP HHOCJGFI_02180_lldP MACAJNEK_01950_lldP KDLNNIOI_00263_lldP
suffix=substr($1,1,length($1)-1)
- gets the value of the 1st column without its trailing ":"
for(i=2;i<=NF;i++) $i=$i"_"suffix
- appends the suffix value to each of the following columns
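The two steps above can be sketched in a more explicit form. One caveat: with default settings awk splits on any whitespace and rebuilds modified lines with single spaces, so tabs in the input would not survive. The variant below assumes the file really is tab-delimited, as the question states, and sets both FS and OFS to a tab. The function name add_suffix is hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: append the first-column label (minus its trailing ":")
# to every other field, keeping tabs as the field separator.
add_suffix() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    suffix = $1
    sub(/:$/, "", suffix)          # strip the trailing colon, if present
    for (i = 2; i <= NF; i++)
      $i = $i "_" suffix           # modifying a field rebuilds the line with OFS
    print
  }' "$@"
}

# Usage: pass a file name, or feed lines on standard input.
printf 'rhaB:\tIENJKMAH_01395\tMACAJNEK_00455\n' | add_suffix
```

Using sub() rather than substr() means a label without a trailing colon is left unchanged instead of losing its last character.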
To get "prettified" columnar output, you can pipe the result to column -tx:
awk '{suffix=substr($1,1,length($1)-1); for(i=2;i<=NF;i++) $i=$i"_"suffix}1' file | column -tx
Output:
rhaB:    IENJKMAH_01395_rhaB    MACAJNEK_00455_rhaB    OLCKBDOH_04002_rhaB    PMOGBMCF_03363_rhaB    ANGDFNGL_03589_rhaB
exuT_1:  OLCKBDOH_00247_exuT_1  EHNKCCHC_00463_exuT_1  MACAJNEK_00987_exuT_1  PMOGBMCF_00492_exuT_1  LPCGNNBB_01394_exuT_1
recA:    OLCKBDOH_01231_recA    MOEFEGAP_03152_recA    JFGDENGL_01411_recA    DNGGHEME_03701_recA    KFALDAGO_00482_recA
lldP:    OLCKBDOH_02876_lldP    EHNKCCHC_01431_lldP    HHOCJGFI_02180_lldP    MACAJNEK_01950_lldP    KDLNNIOI_00263_lldP
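Since the question mentions eventually deleting the first column, that step could be folded into the same pass. This is only a sketch, assuming whitespace-separated input like the sample shown; the function name drop_label is hypothetical.

```shell
# Hypothetical helper: tag each ID with its row label, then omit the label column.
drop_label() {
  awk '{
    suffix = $1
    sub(/:$/, "", suffix)            # label without the trailing colon
    line = ""
    for (i = 2; i <= NF; i++) {
      sep = (i > 2) ? " " : ""       # no separator before the first ID
      line = line sep $i "_" suffix
    }
    print line
  }' "$@"
}

printf 'recA: OLCKBDOH_01231 MOEFEGAP_03152\n' | drop_label
```

Each ID still carries its origin in the suffix, so the first column is no longer needed to trace it back.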