Perl: for each key in a hash, print the first value... then the second in the next loop, and so on

Time: 2015-11-22 14:21:19

Tags: perl transpose

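So,

I read a lot of .txt file data with biological sequences into my @col array. I need the frequency of each letter (A, C, G, T) at each position. Everything is fine and it works, but I would like to transpose the output.

The output is: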

A 1.112 1.124 1.258

C 1.154 1.122 1.587

G 1.158 1.454 1.478

T 1.154 1.125 1.478

But I want to transpose that, I mean turn the rows into columns, like this:

A C G T

1.112 1.154 1.154 1.154

and so on.

Code:

my @col = qw(
    GTGTCCATTAGAGGGCGCCA GCAGCCTCCTGAGGACGCCA GAGACCTCAAGGGGCCACTA
    GGGGCCACTAGGGGGCTCGA ATGGCCACAAGAGGGCGTCA CTGCCCGCCCGGCGGCGCCG
    GCGGGCAGCAGGGGGAGCCG ATCACCACCAGGTGGCGCCG AAGGACACTAGGTGGAGCCA
    TCGGCCGGCAGAGGGCGCTG ATGACCGCCAGGGGTCGCTC ACCACCAGCAGGGGGCACCT
    GCAGCCCGTGGGGGGCGCCG GTGGGCGGCAGGGGGCGCTG CCAGCCTCTAGGGGCCACTG
    TTGACCACCAGATGGTGGTA CCTGCCGAAAGGGGGCAGTG
);  # and so on

my %pwm;

foreach my $row (@col)
  {
   # takes the substrings of the line, i.e. pos 1, pos 2, ... and counts each letter per position
   ++$pwm{ substr $row, $_, 1 }[ $_ ] for 0 .. length( $row ) - 1;
  }
  @col = ();  # need an empty array for the code above
  # $row_counter is not shown in this excerpt; presumably it holds the number of sequences read
  @$_ = map { $_ ? ($_ / $row_counter) + 1 : 1 } @$_ for values %pwm;

 print "$_ @{ $pwm{$_} }\n" for sort keys %pwm;

1 answer:

Answer 0 (score: 3)

This seems to do what you need, but I'm surprised that you want the frequencies to run from 1 to 2 instead of from 0 to 1.

use strict;
use warnings 'all';

my @col = qw/
    GTGTCCATTAGAGGGCGCCA
    GCAGCCTCCTGAGGACGCCA
    GAGACCTCAAGGGGCCACTA
    GGGGCCACTAGGGGGCTCGA
    ATGGCCACAAGAGGGCGTCA
    CTGCCCGCCCGGCGGCGCCG
    GCGGGCAGCAGGGGGAGCCG
    ATCACCACCAGGTGGCGCCG
    AAGGACACTAGGTGGAGCCA
    TCGGCCGGCAGAGGGCGCTG
    ATGACCGCCAGGGGTCGCTC
    ACCACCAGCAGGGGGCACCT
    GCAGCCCGTGGGGGGCGCCG
    GTGGGCGGCAGGGGGCGCTG
    CCAGCCTCTAGGGGCCACTG
    TTGACCACCAGATGGTGGTA
    CCTGCCGAAAGGGGGCAGTG
/;

my %pwm;

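# Count how often each letter occurs at each position
for ( @col ) {

    my @row = split //;    # split the sequence in $_ into single characters

    for my $i ( 0 .. $#row ) {
        my $k = $row[$i];
        ++$pwm{$k}[$i];
    }
}

# Turn each count into a frequency and add 1 (undefined counts become 0)
for my $counts ( values %pwm ) {
    for my $count ( @$counts ) {
        $count = ( $count // 0) / @col + 1;
    }
}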

my @keys = sort keys %pwm;

my $fmt = '%-5s ' x @keys . "\n";
printf $fmt, @keys;

$fmt = '%.3f ' x @keys . "\n";

# One row per sequence position; all sequences are assumed to have the same length
for my $i ( 0 .. length( $col[0] ) - 1 ) {
    printf $fmt, map { $pwm{$_}[$i] } @keys;
}

Output

A     C     G     T
1.294 1.176 1.412 1.118
1.118 1.412 1.059 1.412
1.176 1.118 1.647 1.059
1.294 1.059 1.588 1.059
1.059 1.824 1.118 1.000
1.000 2.000 1.000 1.000
1.471 1.059 1.294 1.176
1.059 1.588 1.294 1.059
1.176 1.529 1.000 1.294
1.824 1.059 1.059 1.059
1.000 1.000 2.000 1.000
1.294 1.000 1.706 1.000
1.000 1.059 1.765 1.176
1.000 1.000 2.000 1.000
1.059 1.118 1.765 1.059
1.118 1.824 1.000 1.059
1.235 1.000 1.706 1.059
1.000 1.824 1.118 1.059
1.000 1.529 1.059 1.412
1.412 1.059 1.471 1.059
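
Following up on the remark about the 1-to-2 range, here is a minimal sketch of how the same transposed table could be printed with plain frequencies from 0 to 1. The helper name position_frequencies and the four short sample sequences are made up for illustration and are not part of the question or the answer. The sketch assumes that all sequences have the same length.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a hash of
# array refs, letter => [ frequency at position 0, position 1, ... ].
sub position_frequencies {
    my @seqs = @_;
    my $len  = length $seqs[0];    # assumption: every sequence has this length
    my %freq;

    # Count each letter per position
    for my $seq (@seqs) {
        my @chars = split //, $seq;
        $freq{ $chars[$_] }[$_]++ for 0 .. $len - 1;
    }

    # Fill positions a letter never occupied with 0, then divide by the
    # number of sequences to get a frequency between 0 and 1
    for my $counts ( values %freq ) {
        $counts->[$_] = ( $counts->[$_] // 0 ) / @seqs for 0 .. $len - 1;
    }

    return %freq;
}

# Made-up sample data, much shorter than the sequences in the question
my @seqs = qw/ GTGT GCAG GAGA ATGG /;
my %freq = position_frequencies(@seqs);
my @keys = sort keys %freq;

# Transposed layout: one column per letter, one row per position
my $head_fmt = '%-5s ' x @keys . "\n";
my $row_fmt  = '%.3f ' x @keys . "\n";

printf $head_fmt, @keys;
for my $i ( 0 .. length( $seqs[0] ) - 1 ) {
    printf $row_fmt, map { $freq{$_}[$i] } @keys;
}
```

Dropping the "+ 1" is the only change to the arithmetic. Filling every position up to the sequence length also keeps the rows complete when a letter never appears near the end of any sequence.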