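I have a file that contains state names such as Gujarat, West Bengal, Jammu & Kashmir and D&D Haveli. I wrote regular expressions to capture these names. Each name should be stored under a key for its year (2001), which is also captured by a regular expression, so that the year is the key and the individual states are stored under that same key.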
my $stat;
my ($line, $year, $state_name, @state_name);
while($line = <FH>){
    if($line =~ m/^Year (\d+)\S+/){
        $year = $1;
        $stat->{$year} = {};
        next;
    }
    elsif ($line =~ m/^State:,(\w+\s\w+)/){
        $state_name = $1;
        $stat->{$year}{$state_name} = {};
        next;
    }
    elsif ($line =~ m/^State:(\w+)/){
        $state_name = $1;
        $stat->{$year}{$state_name} = {};
        next;
    }
    elsif ($line =~ m/^State:(\w&\w\s\w+)/){
        $state_name = $1;
        $stat->{$year}{$state_name} = {};
        next;
    }
    elsif ($line =~ m/^State:(\w+\s&\s\w+)/){
        $state_name = $1;
        $stat->{$year}{$state_name} = {};
        next;
    }
}
print (Dumper(\$stat));
I want to print something like this:
$VAR2 = {'2001' => {
'Gujarat'
'Jammu & Kashmir'
'West Bengal'
'D&D Haveli'
}
}
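Instead, only West Bengal is printed in the hash under the key 2001; the others are omitted. Could you please point out where I am going wrong? Thank you.
Edit: the file looks like this: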
Year 2001,,,,,,,,
State:,West Bengal,,,,,,,
Year 2001,,,,,,,,
State:,Gujarat,,,,,,,
Year 2001,,,,,,,,
State:,Jammu & Kashmir,,,,,,, and so on.
Answer 0 (score: 2)
The code:
if($line =~ m/^Year (\d+)\S+/){
    $year = $1;
    $stat->{$year} = {};
    next;
}
This overwrites whatever is already stored under $stat->{$year}, because the year value "2001" appears more than once in your file.
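To make the overwrite concrete, here is a minimal sketch that replays by hand what the loop does when it meets a second "Year 2001" line. The hard-coded assignments stand in for the file lines and are an illustration only, not the asker's actual program.

```perl
use strict;
use warnings;

my $stat = {};

# First "Year 2001" line: create the slot for the year.
$stat->{2001} = {};
# "State:,West Bengal" line: store the state under that year.
$stat->{2001}{'West Bengal'} = {};
# Second "Year 2001" line: the unconditional assignment replaces
# the existing hash, so the state stored above is discarded.
$stat->{2001} = {};

my $count = scalar keys %{ $stat->{2001} };
print "states kept: $count\n";   # prints "states kept: 0"
```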
Quick fix:
if($line =~ m/^Year (\d+)\S+/){
    $year = $1;
    if (not defined $stat->{$year}) {
        $stat->{$year} = {};
    }
    next;
}
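Putting the fix into a complete loop, the following is a sketch rather than a drop-in replacement. It assumes Perl 5.10 or later (for the `//=` operator) and reads from an in-memory list copied from the file excerpt in the question instead of the FH handle. It also swaps the four State patterns for a single one, `^State:,([^,]+)`, which is my own substitution: in the excerpt every state line has a comma right after `State:`, so the question's patterns that omit that comma would likely not match lines such as `State:,Gujarat` even once the overwrite is fixed.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sample input copied from the file excerpt in the question.
my @lines = (
    'Year 2001,,,,,,,,',
    'State:,West Bengal,,,,,,,',
    'Year 2001,,,,,,,,',
    'State:,Gujarat,,,,,,,',
    'Year 2001,,,,,,,,',
    'State:,Jammu & Kashmir,,,,,,,',
);

my $stat = {};
my $year;

for my $line (@lines) {
    if ($line =~ /^Year (\d+)/) {
        $year = $1;
        # //= assigns only when the slot is still undefined,
        # so states stored by earlier lines are kept.
        $stat->{$year} //= {};
    }
    elsif (defined $year && $line =~ /^State:,([^,]+)/) {
        # [^,]+ captures everything up to the next comma, which covers
        # one-word names, names with spaces, and names containing "&".
        $stat->{$year}{$1} = {};
    }
}

# Sort keys so the dump is stable between runs.
$Data::Dumper::Sortkeys = 1;
print Dumper($stat);
```

With this input the hash should end up with a single key, 2001, holding Gujarat, Jammu & Kashmir and West Bengal. A name like D&D Haveli would be captured by the same pattern, since it contains no comma.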