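I wrote a script that uses a subroutine to report the percentage of a given nucleotide in a sequence. When I run the script, the output for every nucleotide percentage is always zero.
Here is my code: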
#!/usr/bin/perl
use strict;
use warnings;
#### Subroutine to report percentage of each nucleotide in DNA sequence ####
my $input = $ARGV[0];
my $nt = $ARGV[1];
my $args = $#ARGV +1;
if($args != 2){
    print "Error!!! Insufficient number of arguments\n";
    print "Usage: $0 <input fasta file>\n";
}
my($FH, $line);
open($FH, '<', $input) || die "Could\'nt open file: $input\n";
$line = do{
    local $/;
    <$FH>;
};
$line =~ s/>(.*)//g;
$line =~ s/\s+//g;
my $perc = perc_nucleotide($line , $nt);
printf("The percentage of $nt nucleotide in given sequence is %.0f", $perc);
print "\n";
sub perc_nucleotide {
    my($line, $nt) = @_;
    print "$nt\n";
    my $count = 0;
    if( $nt eq "A" || $nt eq "T" || $nt eq "G" || $nt eq "C"){
        $count++;
    }
    my $total_len = length($line);
    my $perc = ($count/$total_len)*100;
}
I think I am setting the `$count` variable incorrectly. I have tried different approaches but cannot figure it out.
Here is the input file:
>XM_024894547.1 Trichoderma citrinoviride Redoxin (BBK36DRAFT_1163529), partial mRNA
ATGGCCTTCCGTCTCCCTCTGCGCCGCATTGCCCTGGCCCGCCCCGCCACCGTTGCGCGTGGCTTCCACT
CGACGCCCCGCGCCCTGGTCAAGGTCGGCGACGAGGTCCCGAGCTTGGAGCTGTTCGAGAAGTCGGCCGC
CAGCAAGATCAACCTGGCCGACGAGTTCAAGAAGGGCGACGGCTACATTGTCGGCGTCCCGGGCGCCTTC
TCCGGCACCTGCTCCGGCACCCACGTCCCGTCGTACATCAACCACCCTGACATCAAGACGGCCGGCCAGG
TCTTTGTCGTCTCCGTCAACGACCCCTTTGTCATGAAGGCTTGGGCAGACCAGCTGGATCCCGCCGGAGA
GACAGGAATCCGGTTCGTTGCCGACCCCACGGCTGAGTTCACAAAGGCTCTGGAACTGGGATTCGACGAC
GCTGCTCCTCTGTTCGGAGGCACCCGAAGCAAGCGCTATGCTCTCAAGGTTAAGGATGGCAAGGTCACTG
CCGCCTTTGTTGAGCCCGACAACACGGGCACTTCCGTGTCAATGGCCGACAAGGTCCTCAGCTAA
Answer 0 (score: 4)
The problem is here:
my $perc = perc_nucleotide($line , $nt);
printf("The percentage of $nt nucleotide in given sequence is %.0f", $perc);
`perc_nucleotide` is returning 0.18018018018018, but the format `%.0f` says to print it with no decimal places, so it is rounded to 0. You should use something more like `%.2f`.
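To make the formatting point concrete, here is a minimal sketch that formats the value quoted above with both specifiers. The variable names are only for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Value quoted in the answer: what perc_nucleotide returns for this input.
my $perc = 0.18018018018018;

# %.0f rounds to zero decimal places; %.2f keeps two.
my $no_decimals  = sprintf("%.0f", $perc);
my $two_decimals = sprintf("%.2f", $perc);

print "no decimal places:  $no_decimals\n";
print "two decimal places: $two_decimals\n";
```

With two decimal places the value shows up as 0.18 instead of 0.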
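It is also worth noting that `perc_nucleotide` has no `return`. It still works, but perhaps not for obvious reasons.
`perc_nucleotide` sets `my $perc = ($count/$total_len)*100;` but never uses `$perc`. The `$perc` in the main program is a different variable.
`perc_nucleotide` does return something: every Perl subroutine without an explicit `return` returns the value of the last expression evaluated. In this case that is `my $perc = ($count/$total_len)*100;`, but the last-evaluated-expression rule can get a bit tricky.
It is easier to read, and safer, to have an explicit return: `return ($count/$total_len)*100;`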
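As a small sketch of the two styles discussed above, the following compares an implicit and an explicit return. The sub names `perc_implicit` and `perc_explicit` are hypothetical, and the arguments 1 and 555 are the count and sequence length that produce the 0.18 value mentioned earlier.

```perl
use strict;
use warnings;

# Implicit return: the value of the last evaluated expression
# (here, the assignment) is what the caller receives.
sub perc_implicit {
    my ($count, $total_len) = @_;
    my $perc = ($count / $total_len) * 100;
}

# Explicit return: same result, but the intent is visible to the reader.
sub perc_explicit {
    my ($count, $total_len) = @_;
    return ($count / $total_len) * 100;
}

printf("implicit: %.2f\n", perc_implicit(1, 555));
printf("explicit: %.2f\n", perc_explicit(1, 555));
```

Both calls give the same number. The explicit version simply does not rely on the reader knowing the last-expression rule.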
Answer 1 (score: 0)
I corrected the script, and it now gives me the correct answer.
#!/usr/bin/perl
use strict;
use warnings;
##### Subroutine to calculate percentage of all nucleotides in a DNA sequence #####
my $input = $ARGV[0];
my $nt = $ARGV[1];
my $args = $#ARGV + 1;
if($args != 2){
    print "Error!!! Insufficient number of arguments\n";
    print "Usage: $0 <input_fasta_file> <nucleotide>\n";
    exit 1;   # stop here instead of continuing with missing arguments
}
my($FH, $line);
open($FH, '<', $input) || die "Couldn\'t open input file: $input\n";
# Slurp the whole file into one string
$line = do{
    local $/;
    <$FH>;
};
chomp $line;
#print $line;
$line =~ s/>(.*)//g;   # drop the FASTA header line
$line =~ s/\s+//g;     # drop newlines and other whitespace
#print "$line\n";
my $total_len = length($line);
my $perc_of_nt = perc($line, $nt);
# changed: two decimal places and a literal percent sign
printf("The percentage of nucleotide $nt in a given sequence is %.2f%%", $perc_of_nt);
print "\n";
#print "$total_len\n";
sub perc{
    my($line, $nt) = @_;
    my $char; my $count = 0;
    # changed: count every character that matches the requested nucleotide
    foreach $char (split //, $line){
        if($char eq $nt){
            $count += 1;
        }
    }
    # changed: explicit return ($total_len is the file-level variable set above)
    return (($count/$total_len)*100)
}
The output for the input file above is:
Total_len = 555
The percentage of nucleotide A in a given sequence is 18.02%
The percentage of nucleotide T in a given sequence is 18.74%
The percentage of nucleotide G in a given sequence is 28.47%
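The changes I made are the printf line, the counting loop in `perc`, and the explicit return (all marked in the code above).
Thanks for the amazing insights!!!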
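For comparison, the counting loop can also be written without splitting the string into characters. This is only a sketch of one alternative, not the code from the answers: the helper name `perc_by_match` is hypothetical, and the short test sequence is the first ten bases of the input file.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of one nucleotide in a sequence string.
sub perc_by_match {
    my ($seq, $nt) = @_;
    my $total_len = length($seq);
    return 0 unless $total_len;   # avoid dividing by zero on empty input
    # A global match in list context returns every match; assigning that
    # list to () and then to a scalar yields the number of matches.
    my $count = () = $seq =~ /\Q$nt\E/g;
    return ($count / $total_len) * 100;
}

my $seq = "ATGGCCTTCC";   # first ten bases of the input file
printf("%s: %.2f%%\n", $_, perc_by_match($seq, $_)) for qw(A T G C);
```

Passing the sequence in and taking its length inside the sub also removes the dependence on a file-level `$total_len`.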