My text file contains lines like these:
>SCRT2_DBD_NNGCAACAGGTGN
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
1.16863332073e-07 0.940983666713 1.16863332073e-07 0.0590160995601
0.00506737765087 7.91765386614e-08 0.988123281671 0.00680926150142
0.0623177863824 0.93243216705 0.000777853090471 0.00447219347766
0.00453077729507 0.995469025719 9.8493017436e-08 9.8493017436e-08
0.507583592195 0.453364643178 0.0180440139317 0.0210077506946
>SNAI2_DBD_NRCAGGTGN
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
>SP1_DBD_GCCMCGCCCMC
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
1.16863332073e-07 0.940983666713 1.16863332073e-07 0.0590160995601
0.00506737765087 7.91765386614e-08 0.988123281671 0.00680926150142
0.0623177863824 0.93243216705 0.000777853090471 0.00447219347766
0.00453077729507 0.995469025719 9.8493017436e-08 9.8493017436e-08
0.507583592195 0.453364643178 0.0180440139317 0.0210077506946
I would like to get this:
>M_SCRT2
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
1.16863332073e-07 0.940983666713 1.16863332073e-07 0.0590160995601
0.00506737765087 7.91765386614e-08 0.988123281671 0.00680926150142
0.0623177863824 0.93243216705 0.000777853090471 0.00447219347766
0.00453077729507 0.995469025719 9.8493017436e-08 9.8493017436e-08
0.507583592195 0.453364643178 0.0180440139317 0.0210077506946
>M_SNAI2
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
>M_SP1
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
1.16863332073e-07 0.940983666713 1.16863332073e-07 0.0590160995601
0.00506737765087 7.91765386614e-08 0.988123281671 0.00680926150142
0.0623177863824 0.93243216705 0.000777853090471 0.00447219347766
0.00453077729507 0.995469025719 9.8493017436e-08 9.8493017436e-08
0.507583592195 0.453364643178 0.0180440139317 0.0210077506946
I don't want to do this by hand because there are too many of them.
Please help with a one-liner in awk or perl.
Answer 0 (score: 3)
Using awk:
$ awk -F"[>_]" '/^>/{ print ">M_" $2; next }1' file
>M_SCRT2
>M_SNAI2
>M_SP1
>M_SP3
Using perl:
$ perl -F"[>_]" -lane 'print /^>/ ? ">M_$F[1]" : $_' file
>M_SCRT2
>M_SNAI2
>M_SP1
>M_SP3
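The output shown above lists only the header lines, but both commands also print the numeric lines unchanged. Here is a minimal sketch to confirm that, assuming a hypothetical two-record file named sample.txt (the file names are made up for illustration):

```shell
# Build a small sample; sample.txt and renamed.txt are hypothetical names.
cat > sample.txt <<'EOF'
>SCRT2_DBD_NNGCAACAGGTGN
0.455331585111 0.0458438972816 0.145508011584 0.353316506023
>SNAI2_DBD_NRCAGGTGN
0.173692317806 0.0247846149283 0.759302422526 0.0422206447403
EOF

# On header lines the fields are split on ">" or "_", so $1 is empty and
# $2 is the name before the first underscore. The trailing 1 is an
# always-true pattern that prints every other line as it is.
awk -F'[>_]' '/^>/{ print ">M_" $2; next }1' sample.txt > renamed.txt

cat renamed.txt
```

Writing to a second file rather than over the input makes it easy to diff the two before replacing anything.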
Answer 1 (score: 2)
Alternatives:
perl -pe 's/>(.*?)_.*/>M_$1/'
perl -pe 's/_.*//;s/>/>M_/'
Or with sed:
sed 's/_.*//;s/>/>M_/'
Answer 2 (score: 1)
You can try the following sed command:
sed 's/^>\([^_]*\).*$/>M_\1/' file
Example:
$ sed 's/^>\([^_]*\).*$/>M_\1/' file
>M_SCRT2
>M_SNAI2
>M_SP1
>M_SP3
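To write the result back to the file, one cautious option is to go through a temporary file and replace the original only when every header was rewritten. This is just a sketch; motifs.txt and motifs.txt.tmp are hypothetical names standing in for the real file:

```shell
# Hypothetical stand-in for the real file.
printf '%s\n' '>SCRT2_DBD_NNGCAACAGGTGN' \
    '0.455331585111 0.0458438972816 0.145508011584 0.353316506023' > motifs.txt

# Count headers before, rewrite into a temporary file, count renamed headers.
before=$(grep -c '^>' motifs.txt)
sed 's/^>\([^_]*\).*$/>M_\1/' motifs.txt > motifs.txt.tmp
after=$(grep -c '^>M_' motifs.txt.tmp)

# Replace the original only if the two counts agree.
if [ "$before" -eq "$after" ]; then
    mv motifs.txt.tmp motifs.txt
fi

cat motifs.txt
```

The count check is a cheap sanity test, not a full validation; a diff against the original is still worth a look on the first run.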