How can I count the occurrences of each pair of 2 consecutive characters (AA, AC, AG, AT, CC, CA, ...) in a sequence like this:
$sequence = 'AACGTACTGACGTACTGGTTGGTACGA';
Overlapping is not allowed, i.e. $sequence should be read from left to right as AA CG TA CT ..., not as AA AC CG ....
Answer 0 (score: 5)
@result = $subject =~ m/[ACTG][ATGC]/g;
print scalar(@result);
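A minimal sketch of why this gives non-overlapping pairs: in list context, `m//g` returns every match, and each search resumes where the previous match ended. The variable name `@pairs` is only illustrative, and the trailing unpaired character (the 27th) is simply left unmatched.

```perl
use strict;
use warnings;

my $subject = 'AACGTACTGACGTACTGGTTGGTACGA';

# In list context, m//g returns all matches; each search resumes where the
# previous match ended, so the pairs cannot overlap.
my @pairs = $subject =~ m/[ACGT][ACGT]/g;

print "@pairs[0 .. 3]\n";      # first four pairs: AA CG TA CT
print scalar(@pairs), "\n";    # total number of pairs: 13
```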
Edit, since you completely changed your question:
use strict;
my $subject = "AACGTACTGACGTACTGGTTGGTACGA";
my %results = ();
while ($subject =~ m/[ACTG][ATGC]/g) {
# matched text = $&
if(exists $results{$&})
{
$results{$&}++
}
else
{
$results{$&} = 1;
}
}
foreach (sort keys %results) {
print "$_ : $results{$_}\n";
}
Output:
AA : 1
CG : 3
CT : 2
GA : 1
GG : 2
TA : 3
TT : 1
Final edit (hopefully), thanks to @canavanin:
use strict;
my $subject = "AACGTACTGACGTACTGGTTGGTACGA";
my %results = ();
while ($subject =~ m/[ACTG][ATGC]/g) {
# matched text = $&
$results{$&}++;
}
foreach (sort keys %results) {
print "$_ : $results{$_}\n";
}
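A possible variation, sketched under two assumptions that are not part of the answer above: that pairs which never occur should be reported with a count of 0, and that a capture group is preferred over `$&` (which can slow down other regex matches on older Perl versions). The names `@bases` and `$pair` are illustrative.

```perl
use strict;
use warnings;

my $subject = 'AACGTACTGACGTACTGGTTGGTACGA';
my @bases   = qw(A C G T);

# Pre-fill all 16 possible pairs with 0 so that absent pairs are listed too.
my %results;
for my $first (@bases) {
    for my $second (@bases) {
        $results{"$first$second"} = 0;
    }
}

# Capture each non-overlapping pair in $1 instead of relying on $&.
while ($subject =~ m/([ACGT]{2})/g) {
    $results{$1}++;
}

foreach my $pair (sort keys %results) {
    print "$pair : $results{$pair}\n";
}
```

The non-zero counts are the same as in the output above; the remaining nine pairs are printed with 0.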