Question

I have data in this format:

I want to compute the cumulative score (column 3) for each entity in the first column. So I tried building a hash, and my code looks like this:

use strict;
use warnings;

use Data::Dumper;

my $file = shift;
open(DATA, '<', $file) or die "Cannot open $file: $!";

my %hash;
while ( my $line = <DATA> ) {
  chomp $line;
  my ($protein, $year, $score) = split /\s+/, $line;
  push @{ $hash{$protein}{$year} }, $score;
}

print Dumper \%hash;

close DATA;

The output looks like this:

$VAR1 = {
          'a3' => {
                    '1902' => [
                                5
                              ]
                  },
          'a1' => {
                    '1902' => [
                                6
                              ],
                    '1901' => [
                                4
                              ]
                  },
          'a4' => {
                    '1903' => [
                                8
                              ],
                    '1902' => [
                                7
                              ]
                  },
          'a5' => {
                    '1903' => [
                                9
                              ]
                  }
        };

I now want to access each entity in column 1 (a1, a2, a3) and add up its scores, so the desired output would be:

a1 1901 4
a1 1902 9    # 4+5
a3 1902 6
a4 1902 7
a4 1903 16   # 7+9
a5 1903 9

But I can't figure out how to access the hash I created in a loop so that I can add the values.

Answer 1

I think

a4 1903 16   # Sum of a4 1902 and a5 1903

should be

a4 1903 15   # Sum of a4 1902 and a4 1903

If so:

use feature qw( say );

# Store each score under its protein and year.
my %scores_by_protein_and_year;
while (<DATA>) {
   my ($protein, $year, $score) = split;
   $scores_by_protein_and_year{$protein}{$year} = $score;
}

# Walk the proteins in sorted order, keeping a running total across years.
for my $protein (sort keys(%scores_by_protein_and_year)) {
   my $scores_by_year = $scores_by_protein_and_year{$protein};
   my $score = 0;
   for my $year (sort { $a <=> $b } keys(%$scores_by_year)) {
      $score += $scores_by_year->{$year};
      say "$protein $year $score";
   }
}

This works even if the data isn't grouped or sorted.
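
If you would rather keep the hash of arrays built in the question, the same nested walk applies: dereference the inner array with @{ ... } and sum it before adding it to the running total. The following is only a minimal sketch under stated assumptions. The sample rows and the helper name cumulative_lines are made up for illustration and are not the asker's actual data or code.

```perl
use strict;
use warnings;
use List::Util qw( sum0 );

# Hypothetical sample rows (protein, year, score); not the asker's real file.
my @rows = (
    'a1 1901 4',
    'a1 1902 5',
    'a4 1903 8',
    'a4 1902 7',
);

# Build the same hash-of-hashes-of-arrays shape used in the question.
my %hash;
for my $row (@rows) {
    my ($protein, $year, $score) = split ' ', $row;
    push @{ $hash{$protein}{$year} }, $score;
}

# Hypothetical helper: returns one "protein year running_total" line
# per protein/year pair, with years in numeric order.
sub cumulative_lines {
    my ($data) = @_;
    my @lines;
    for my $protein (sort keys %$data) {
        my $total = 0;
        for my $year (sort { $a <=> $b } keys %{ $data->{$protein} }) {
            # Each year holds an array ref; sum0 returns 0 for an empty list.
            $total += sum0 @{ $data->{$protein}{$year} };
            push @lines, "$protein $year $total";
        }
    }
    return @lines;
}

print "$_\n" for cumulative_lines(\%hash);
```

Storing an array per year only pays off if the same protein and year can appear on more than one input line; otherwise the plain scalar used above is simpler.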

Answer 2

If the data is always sorted the way you've shown, you can process it as you read it from the file:

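my $current = '';   # protein seen on the previous line
my $total   = 0;    # running total for the current protein

while ( <DATA> ) {
    my ($protein, $year, $score) = split;

    # Reset the running total whenever a new protein starts.
    $total = 0 unless $protein eq $current;
    $total += $score;

    print "$protein $year $total\n";

    $current = $protein;
}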
