I have a large file (over 1GB) that I want to parse, using two of its fields to build a hash of array references.
Here is a sample of the file:
ra_uuid: 592bbb0c-2c6b-11e8-8580-00e081ea0e98
cms_uuid: a4e6bffc-2c6a-11e8-a7cf-00e081ea0e8e
mpd_uuid: bf3fd34c-2c57-11e8-8bc5-00e081ea0e5c
amLeader: 0
numAssignments = 20909996
mpg=1 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=2 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=50
mpg=1 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=100
mpg=2 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=3 mrule=150 reg=7989 score=0 rank=0 perc=100 mp_demand=20
mpg=4 mrule=150 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=3 mrule=150 reg=7989 score=0 rank=0 perc=100 mp_demand=20
mpg=4 mrule=150 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
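I want every value of the mrule field to become a hash key, with all the corresponding mp_demand values collected in an array reference.
Here is the output I expect for the sample above: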
{
'140' => [40,50,100,40],
'150' => [20,40,20,40]
}
My code:
use strict;
use warnings;
use Data::Dumper qw( Dumper );
my @bigarray;
my %hash;
my $hash_ref;
my @column;
my $key;
my $value;
open(FILE, "<", "$RESULTS_FILE/$ASSIGNMENT_MESSAGE_OUTPUT") or die("Could not open $ASSIGNMENT_MESSAGE_OUTPUT to read");
while(my $data = <FILE>){
map {s/=/ /g;} $data;
@column = split(/\t/, $data);
print("the column is ". Dumper(\@column));
$key = $column[3];
$value = $column[13];
$hash{$key} = $value ;
}
$hash_ref = \%hash ;
push(@bigarray, $hash_ref);
print("the hash is ". Dumper($hash_ref));
print("the demand array is ". Dumper(\@bigarray));
It produces the following output:
the column is $VAR1 = [
'ra_uuid: 592bbb0c-2c6b-11e8-8580-00e081ea0e98
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 1.
the column is $VAR1 = [
'cms_uuid: a4e6bffc-2c6a-11e8-a7cf-00e081ea0e8e
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 2.
the column is $VAR1 = [
'mpd_uuid: bf3fd34c-2c57-11e8-8bc5-00e081ea0e5c
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 3.
the column is $VAR1 = [
'amLeader: 0
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 4.
the column is $VAR1 = [
'numAssignments 20909996
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 5.
the column is $VAR1 = [
'mpg 1 mrule 140 reg 7989 score 10625 rank 0 perc 100 mp_demand 40
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 6.
the column is $VAR1 = [
'mpg 2 mrule 140 reg 7989 score 10625 rank 0 perc 100 mp_demand 50
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 7.
the column is $VAR1 = [
'mpg 1 mrule 140 reg 7989 score 10625 rank 0 perc 100 mp_demand 100
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 8.
the column is $VAR1 = [
'mpg 2 mrule 140 reg 7989 score 10625 rank 0 perc 100 mp_demand 40
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 9.
the column is $VAR1 = [
'mpg 3 mrule 150 reg 7989 score 0 rank 0 perc 100 mp_demand 20
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 10.
the column is $VAR1 = [
'mpg 4 mrule 150 reg 7989 score 10625 rank 0 perc 100 mp_demand 40
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 11.
the column is $VAR1 = [
'mpg 3 mrule 150 reg 7989 score 0 rank 0 perc 100 mp_demand 20
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 12.
the column is $VAR1 = [
'mpg 4 mrule 150 reg 7989 score 10625 rank 0 perc 100 mp_demand 40
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 13.
the column is $VAR1 = [
'
'
];
Use of uninitialized value $key in hash element at a.pl line 19, <FILE> line 14.
the hash is $VAR1 = {
'' => undef
};
the demand array is $VAR1 = [
{
'' => undef
}
];
Answer 0 (score: 1)
use strict;
use warnings;
use Data::Dumper;
my %mp_demand_by_mrule;
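while (<DATA>) {
    # Skip header lines that carry no mrule field.
    next unless /mrule/;
    # Split "key=value key=value ..." on '=' and whitespace into a key => value hash.
    my %record = split(/[=\s]+/);
    # Append this line's mp_demand to the array for its mrule (autovivified on first use).
    push(@{$mp_demand_by_mrule{$record{mrule}}}, $record{mp_demand});
}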
print Dumper(\%mp_demand_by_mrule);
__DATA__
ra_uuid: 592bbb0c-2c6b-11e8-8580-00e081ea0e98
cms_uuid: a4e6bffc-2c6a-11e8-a7cf-00e081ea0e8e
mpd_uuid: bf3fd34c-2c57-11e8-8bc5-00e081ea0e5c
amLeader: 0
numAssignments = 20909996
mpg=1 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=2 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=50
mpg=1 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=100
mpg=2 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=3 mrule=150 reg=7989 score=0 rank=0 perc=100 mp_demand=20
mpg=4 mrule=150 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=3 mrule=150 reg=7989 score=0 rank=0 perc=100 mp_demand=20
mpg=4 mrule=150 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
Output:
$VAR1 = {
'140' => [
'40',
'50',
'100',
'40'
],
'150' => [
'20',
'40',
'20',
'40'
]
};
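For reference, the question's code ends up with an empty key for two reasons visible in its output: the fields are separated by spaces, so split(/\t/, ...) returns the whole line as a single element and $column[3] is undefined, and $hash{$key} = $value overwrites the previous value instead of appending to an array. Below is a minimal sketch of the same idea as the answer, reading line by line from a lexical filehandle so the whole 1GB file never sits in memory. The sub name demand_by_mrule and the in-memory sample are hypothetical, used only for illustration; in practice you would open the real file path instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): reads a filehandle line by
# line and collects every mp_demand value under its mrule key.
sub demand_by_mrule {
    my ($fh) = @_;
    my %demand_by_mrule;
    while (my $line = <$fh>) {
        # Skip header lines that do not contain both fields.
        next unless $line =~ /\bmrule=(\d+)\b.*\bmp_demand=(\d+)\b/;
        # push onto the (autovivified) array for this mrule instead of overwriting.
        push @{ $demand_by_mrule{$1} }, $2;
    }
    return \%demand_by_mrule;
}

# An in-memory filehandle stands in for the real file here (assumption for the demo).
my $sample = <<'END';
numAssignments = 20909996
mpg=1 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=40
mpg=3 mrule=150 reg=7989 score=0 rank=0 perc=100 mp_demand=20
mpg=2 mrule=140 reg=7989 score=10625 rank=0 perc=100 mp_demand=50
END
open(my $fh, '<', \$sample) or die "Could not open sample: $!";
my $result = demand_by_mrule($fh);
close($fh);

# Print each mrule with its collected demands, sorted by key.
for my $mrule (sort keys %$result) {
    print "$mrule => [", join(',', @{ $result->{$mrule} }), "]\n";
}
```

Matching the two fields with a regex avoids building a full hash for every line, which may matter with roughly 20 million records, though the split-based version in the answer is easier to extend if more fields are needed later.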