I am trying to calculate the percentage of certain characters in the strings of a FASTA-format file. The file looks like this:
>label
sequence
>label
sequence
>label
sequence
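I want to calculate the percentage of specific characters (e.g. G's) in each "sequence" string. After that calculation (which I am able to do), I want to print a sentence such as: "The percentage of G's in label 1 is 53%".
So my question is: how do I run the calculation on each sequence string and then name that string in the output using the label on the line above it?
So far my code calculates the percentages, but I cannot get it to identify which label each one belongs to.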
#!/usr/bin/perl
use strict;
# opens file
my $infile = "Lab1_seq.fasta.txt";
open INFILE, $infile or die "$infile: $!\n";
# reads each line
while (my $line = <INFILE>){
    chomp $line;
    #creates an array
    my @seq = split (/>/, $line);
    # Calculates percent
    if ($line !~ />/){
        my $G = ($line =~ tr/G//);
        my $C = ($line =~ tr/C//);
        my $total = $G + $C;
        my $length = length($line);
        my $percent = ($total / $length) * 100;
        #prints the percentage of G's and C's for label is x%
        print "The percentage of G's and C's for @seq[1] is $percent\n";
    }
    else{
    }
}
close INFILE;
When I try to make it also print the name of the label that corresponds to each sequence, it produces the output below:
The percentage of G's and C's for is 53.4868841970569
The percentage of G's and C's for is 52.5443110348771
The percentage of G's and C's for is 50.8746355685131
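Answer 0 (score: 1)
The label and its sequence are on different lines, so when you reach a sequence line, splitting it on ">" yields a single element and $seq[1] is empty. You just need to match the label when you see it and save it in a variable declared outside the loop: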
my $label;
# reads each line
while (my $line = <INFILE>){
    ...
    if ($line =~ />(.*)/){
        # header line: remember the label for the sequence that follows
        $label = $1;
    } else{
        # Calculates percent
        ...
        print "The percentage of G's and C's for $label is $percent\n";
    }
}
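To show the idea end to end, here is a minimal self-contained sketch. It is not the asker's exact script: the sub name gc_percent_by_label, the in-memory sample data, and the choice to accumulate counts across wrapped sequence lines are all assumptions made for illustration. Like the original code, it counts only uppercase G and C.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [label, percent] pairs,
# one per FASTA record read from the given filehandle.
# Assumption: a sequence may span several lines, so counts are
# accumulated until the next header line (or end of file).
sub gc_percent_by_label {
    my ($fh) = @_;
    my (@results, $label, $gc, $length);

    # Stores the record collected so far, skipping empty sequences
    # to avoid dividing by zero.
    my $flush = sub {
        return unless defined $label && $length;
        push @results, [ $label, 100 * $gc / $length ];
    };

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>(.*)/) {
            # Header line: finish the previous record, start a new one.
            $flush->();
            ($label, $gc, $length) = ($1, 0, 0);
        }
        elsif (defined $label) {
            # Sequence line: tr/// with an empty replacement only counts.
            $gc     += ($line =~ tr/GC//);
            $length += length $line;
        }
    }
    $flush->();    # the last record has no header after it
    return @results;
}

# Demo on made-up data held in memory instead of a real file.
my $fasta = ">label1\nGGCC\n>label2\nGATTACA\n";
open my $fh, '<', \$fasta or die "in-memory file: $!\n";
for my $record (gc_percent_by_label($fh)) {
    printf "The percentage of G's and C's for %s is %.2f%%\n", @$record;
}
close $fh;
```

To run it on a real file, open that file with the three-argument form (for example open my $fh, '<', "Lab1_seq.fasta.txt") and pass the handle to the sub. The key point is the same as in the answer above: the label variable lives outside the per-line logic, so it is still available when the sequence lines are processed.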