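I have a FASTA file of sequences, and each sequence begins with its own header line. I need to add a sequence number, counting up from 1 (Seq1, Seq2, ... Seqn), to every sequence header. Here are the first three records:
Input: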
>[organism=Fowl Adenovirus] Fowl Adenovirus FAdV hexon gene, isolate FAdV/SP/1184/2013
AACTGGATCGCGGAAGACGGTAACAAGACAACCATCACCGGACAAATGTCTAA
>[organism=Fowl Adenovirus] Fowl Adenovirus FAdV hexon gene, isolate FAdV/SP/1184/2013
AACTGGATCGCGGAAGACGGTAACAAGACAACCATCACCGGACAAATGTCTAA
>[organism=Fowl Adenovirus] Fowl Adenovirus FAdV hexon gene, isolate FAdV/SP/1184/2013
AACTGGATCGCGGAAGACGGTAACAAGACAACCATCACCGGACAAATGTCTAA
Output:
>Seq1 [organism=Fowl Adenovirus] Fowl Adenovirus FAdV hexon gene, isolate FAdV/SP/1184/2013
AACTGGATCGCGGAAGACGGTAACAAGACAACCATCACCGGACAAATGTCTAA
>Seq2 [organism=Fowl Adenovirus] Fowl Adenovirus FAdV hexon gene, isolate FAdV/SP/1184/2013
AACTGGATCGCGGAAGACGGTAACAAGACAACCATCACCGGACAAATGTCTAA
>Seq3 [organism=Fowl Adenovirus] Fowl Adenovirus FAdV hexon gene, isolate FAdV/SP/1184/2013
AACTGGATCGCGGAAGACGGTAACAAGACAACCATCACCGGACAAATGTCTAA
Answer 0: (score: 2)
An awk one-liner:
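A minimal sketch of how an awk command could do this, assuming the FASTA data arrives on standard input and every header line starts with `>` in the first column. The function name `number_fasta_headers` is made up for illustration, and this is not necessarily the command this answer originally used.

```shell
# number_fasta_headers is a hypothetical helper name used only for this sketch.
# It reads FASTA on stdin and writes it to stdout, turning each ">header"
# line into ">SeqN header", with N counting up from 1.
number_fasta_headers() {
    awk '
        # Header lines: rebuild the line as ">Seq<N> " plus the text after ">"
        /^>/ { $0 = ">Seq" (++n) " " substr($0, 2) }
        # Print every line, modified or not
        { print }
    '
}

# Small inline demo with two records
printf '>first header\nAACT\n>second header\nGGTA\n' | number_fasta_headers
```

To apply it to a file, something like `number_fasta_headers < input.fasta > numbered.fasta` should work; both file names are placeholders.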
Answer 1: (score: 0)
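# Step 1: insert a space after the leading ">" of each header line
sed 's/^>/> /' output_3.fasta > output_4.fasta
# Step 2: replace the leading ">" with ">SeqN", counting up from 1
perl -pe 'BEGIN { our $i = 1; } s/^>/">Seq".($i++)/e;' < output_4.fasta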