示例输入:
>Sample GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample GJVT7LS03DJOYJ
AAACTCC
>Sample GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG
我想要的是什么:
>Sample_1 GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample_2 GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample_3 GJVT7LS03DJOYJ
AAACTCC
>Sample_4 GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample_5 GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG
换句话说,我想为每个匹配模式的行添加count(以“_”开头)(在本例中为“Sample”)。任何sed / awk /等。这个任务的单行?
答案 0 :(得分:4)
一种方式:
$ awk '/^>/{$1=$1"_"++i}1' file
>Sample_1 GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample_2 GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample_3 GJVT7LS03DJOYJ
AAACTCC
>Sample_4 GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample_5 GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG
答案 1 :(得分:2)
一种可能的尝试如下:
$ awk 'BEGIN{a=1}/Sample/ {$1=$1"_"a; a++}1' file
>Sample_1 GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample_2 GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample_3 GJVT7LS03DJOYJ
AAACTCC
>Sample_4 GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample_5 GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG
对于包含“Sample”的每个文件,我们使用"_"$variable
更新第一个字段。这个变量a
最初设置为1,然后我们以1为增量。