Is there a simpler way to edit these lines?

Date: 2014-09-09 16:53:36

Tags: perl awk

My text file contains lines like these:

>SCRT2_DBD_NNGCAACAGGTGN
0.455331585111  0.0458438972816 0.145508011584  0.353316506023
0.173692317806  0.0247846149283 0.759302422526  0.0422206447403
1.16863332073e-07       0.940983666713  1.16863332073e-07       0.0590160995601
0.00506737765087        7.91765386614e-08       0.988123281671  0.00680926150142
0.0623177863824 0.93243216705   0.000777853090471       0.00447219347766
0.00453077729507        0.995469025719  9.8493017436e-08        9.8493017436e-08
0.507583592195  0.453364643178  0.0180440139317 0.0210077506946
>SNAI2_DBD_NRCAGGTGN
0.455331585111  0.0458438972816 0.145508011584  0.353316506023
0.173692317806  0.0247846149283 0.759302422526  0.0422206447403
>SP1_DBD_GCCMCGCCCMC
0.455331585111  0.0458438972816 0.145508011584  0.353316506023
0.173692317806  0.0247846149283 0.759302422526  0.0422206447403
1.16863332073e-07       0.940983666713  1.16863332073e-07       0.0590160995601
0.00506737765087        7.91765386614e-08       0.988123281671  0.00680926150142
0.0623177863824 0.93243216705   0.000777853090471       0.00447219347766
0.00453077729507        0.995469025719  9.8493017436e-08        9.8493017436e-08
0.507583592195  0.453364643178  0.0180440139317 0.0210077506946

I would like to get this:

>M_SCRT2
0.455331585111  0.0458438972816 0.145508011584  0.353316506023
0.173692317806  0.0247846149283 0.759302422526  0.0422206447403
1.16863332073e-07       0.940983666713  1.16863332073e-07       0.0590160995601
0.00506737765087        7.91765386614e-08       0.988123281671  0.00680926150142
0.0623177863824 0.93243216705   0.000777853090471       0.00447219347766
0.00453077729507        0.995469025719  9.8493017436e-08        9.8493017436e-08
0.507583592195  0.453364643178  0.0180440139317 0.0210077506946
>M_SNAI2
0.455331585111  0.0458438972816 0.145508011584  0.353316506023
0.173692317806  0.0247846149283 0.759302422526  0.0422206447403
>M_SP1
0.455331585111  0.0458438972816 0.145508011584  0.353316506023
0.173692317806  0.0247846149283 0.759302422526  0.0422206447403
1.16863332073e-07       0.940983666713  1.16863332073e-07       0.0590160995601
0.00506737765087        7.91765386614e-08       0.988123281671  0.00680926150142
0.0623177863824 0.93243216705   0.000777853090471       0.00447219347766
0.00453077729507        0.995469025719  9.8493017436e-08        9.8493017436e-08
0.507583592195  0.453364643178  0.0180440139317 0.0210077506946

I don't want to do this by hand because there are too many of them.

Please help with a one-liner in awk or perl.

3 answers:

Answer 0 (score: 3)

Using awk:

$ awk -F"[>_]" '/^>/{ print ">M_" $2; next }1' file
>M_SCRT2
>M_SNAI2
>M_SP1
>M_SP3

Using perl:

$ perl -F"[>_]" -lane 'print /^>/ ? ">M_$F[1]" : $_' file
>M_SCRT2
>M_SNAI2
>M_SP1
>M_SP3
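
To see how the awk one-liner treats both header and data lines, here is a minimal sketch. It assumes a temporary sample file (the name `sample.txt` is hypothetical) holding two headers and two data rows taken from the question. With `-F'[>_]'` a header line is split on `>` or `_`, so `$2` is the name before the first underscore; the trailing `1` prints every other line unchanged.

```shell
# Build a small sample file (hypothetical name) from the question's data.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  '>SCRT2_DBD_NNGCAACAGGTGN' \
  '0.455331585111  0.0458438972816 0.145508011584  0.353316506023' \
  '>SNAI2_DBD_NRCAGGTGN' \
  '0.173692317806  0.0247846149283 0.759302422526  0.0422206447403' \
  > "$sample"

# Header lines: rewrite as ">M_" plus the second field, then skip to the next line.
# All other lines: the bare pattern "1" is always true, so they are printed as-is.
awk_out=$(awk -F'[>_]' '/^>/{ print ">M_" $2; next }1' "$sample")
printf '%s\n' "$awk_out"
```

Redirecting the output to a new file (for example `> renamed.txt`, also a hypothetical name) keeps the original untouched.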

Answer 1 (score: 2)

Alternatives:

perl -pe 's/>(.*?)_.*/>M_$1/'
perl -pe 's/_.*//;s/>/>M_/'

Or with sed:

sed 's/_.*//;s/>/>M_/'
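
The substitutions above apply to every line, which is fine for the question's data because only header lines contain `_` or `>`. If other lines could contain those characters, a slightly stricter variant restricts both substitutions to lines that start with `>`. This is a sketch under that assumption; the function name `rename_headers` and the third input line are hypothetical and not part of the question's data.

```shell
# Rewrite only lines beginning with ">": drop everything from the first "_",
# then insert the "M_" prefix after the leading ">".
rename_headers() {
  sed -e '/^>/s/_.*//' -e 's/^>/>M_/' "$@"
}

# The third line is a made-up example of a non-header line containing "_" and ">".
sed_out=$(printf '%s\n' \
  '>SP1_DBD_GCCMCGCCCMC' \
  '0.507583592195  0.453364643178  0.0180440139317 0.0210077506946' \
  'comment_line > kept' | rename_headers)
printf '%s\n' "$sed_out"
```

Called with a file name, as in `rename_headers file`, the function reads that file instead of standard input.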

Answer 2 (score: 1)

You could try the sed command below:

sed 's/^>\([^_]*\).*$/>M_\1/' file

Example:

$ sed 's/^>\([^_]*\).*$/>M_\1/' file
>M_SCRT2
>M_SNAI2
>M_SP1
>M_SP3