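I need to write a Perl script that parses census data (http://pastebin.com/hNzke4V8).
The script needs to parse the data and, for each county, print the county name, the population per square mile of land (population / land area), and the percentage of the county that is water (water area / (land area + water area)).
Finally, the script needs to print the county name and the value for each of the following: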
Highest population density
Lowest population density
Highest percentage of water
Lowest percentage of water
Here is a sample of the expected output:
County Population/sq mile % Water
Adams County 307.2 88.1%
Asotin County 111.8 12.6%
[... etc ...]
Highest population density: Adams County, 9999 people/square mile
Lowest population density: Pierce County, 3 people/square mile
Highest percentage of water: Whitman County, 90.2% water
Lowest percentage of water: Skagit County, 3.6% water
Here is what I have come up with so far (I am not very familiar with Perl):
#!/usr/bin/perl -w
use strict;
use warnings;
#initialize stuff
my %water;
my %popdensity
my @fields;
my $county;
my $lowest_pop_county=0;
my $lowest_pop=9999999;
my $highest_pop_county=0;
my $highest_pop=0;
my $lowest_water_county=0;
my $lowest_water = 1;
my $highest_water_county = 0;
my $highest_water = 0;
#parse input
while (<>)
{
next if /County Name/;
chomp;
@fields = split /,/;
$water{$fields[0]} = $fields[3] / ($fields[2] + $fields[3]);
$popdensity{$fields[0]} = $fields[1] / $fields[2];
foreach $county (keys %water %popdensity)
{
#print values
print "The percent water for $county is %.2f%%\n", 100 * $water{$county};
print "The population per square mile of land for $county is $popdensity{$county}\n";
#determine highest and lowest values
if ($highest_pop < $popdensity{$county})
{
$highest_pop = popdensity{$county};
$highest_pop_county = $county;
}
if ($lowest_pop > $popdensity{$county})
{
$lowest_pop = popdensity{$county};
$lowest_pop_county = $county;
}
if ($highest_water < $water{$county})
{
$highest_water = $water{$county};
$highest_water_county = $county;
}
if ($lowest_water > $water{$county})
{
$lowest_water = $water{$county};
$lowest_water_county = $county;
}
}
#print highest and lowest values
print "Highest population density: $highest_pop_county, $highest_pop\n"
print "Lowest population density: $lowest_pop_county, $lowest_pop\n"
print "Highest percentage of water: $highest_water_county, $highest_water\n"
print "Lowest percentage of water: $lowest_water_county, $lowest_water\n"
}
Unfortunately, when I try to run the script (perl -w script.txt census.txt), I get the following errors:
Operator or semicolon missing before %popdensity at script.txt line 28.
Ambiguous use of % resolved as operator % at script.txt line 28.
syntax error at script.txt line 8, near "my "
Global symbol "@fields" requires explicit package name at script.txt line 8.
Global symbol "$lowest_pop_county" requires explicit package name at script.txt line 10.
Global symbol "$lowest_pop" requires explicit package name at script.txt line 11.
Global symbol "$highest_pop_county" requires explicit package name at script.txt line 12.
Global symbol "$highest_pop" requires explicit package name at script.txt line 13.
Global symbol "$lowest_water_county" requires explicit package name at script.txt line 14.
Global symbol "$lowest_water" requires explicit package name at script.txt line 15.
Global symbol "$highest_water_county" requires explicit package name at script.txt line 16.
Global symbol "$highest_water" requires explicit package name at script.txt line 17.
Global symbol "@fields" requires explicit package name at script.txt line 24.
Global symbol "@fields" requires explicit package name at script.txt line 25.
Global symbol "@fields" requires explicit package name at script.txt line 25.
Global symbol "@fields" requires explicit package name at script.txt line 25.
Global symbol "@fields" requires explicit package name at script.txt line 25.
Global symbol "@fields" requires explicit package name at script.txt line 26.
Global symbol "@fields" requires explicit package name at script.txt line 26.
Global symbol "@fields" requires explicit package name at script.txt line 26.
Type of arg 1 to keys must be hash (not modulus (%)) at script.txt line 29, near "popdensity)
What am I doing wrong? Thanks in advance for your help.
Answer 0 (score: 1)
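A bunch of syntax errors: missing semicolons, and the while loop is missing its closing }. Also, keys takes only one hash. That is not a problem here, though, because both hashes have the same keys.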
This version at least compiles:
#!/usr/local/ActivePerl-5.16/bin/perl
#!/usr/bin/perl -w
use strict;
use warnings;
#initialize stuff
my %water;
my %popdensity;
my @fields;
my $county;
my $lowest_pop_county=0;
my $lowest_pop=9999999;
my $highest_pop_county=0;
my $highest_pop=0;
my $lowest_water_county=0;
my $lowest_water = 1;
my $highest_water_county = 0;
my $highest_water = 0;
#parse input
while (<>)
{
next if /County Name/;
chomp;
@fields = split /,/;
$water{$fields[0]} = $fields[3] / ($fields[2] + $fields[3]);
$popdensity{$fields[0]} = $fields[1] / $fields[2];
foreach $county (keys %water)
{
#print values
printf "The percent water for $county is %.2f%%\n", 100 * $water{$county};
print "The population per square mile of land for $county is $popdensity{$county}\n";
#determine highest and lowest values
if ($highest_pop < $popdensity{$county})
{
$highest_pop = $popdensity{$county};
$highest_pop_county = $county;
}
if ($lowest_pop > $popdensity{$county})
{
$lowest_pop = $popdensity{$county};
$lowest_pop_county = $county;
}
if ($highest_water < $water{$county})
{
$highest_water = $water{$county};
$highest_water_county = $county;
}
if ($lowest_water > $water{$county})
{
$lowest_water = $water{$county};
$lowest_water_county = $county;
}
}
} # while loop
#print highest and lowest values
print "Highest population density: $highest_pop_county, $highest_pop\n";
print "Lowest population density: $lowest_pop_county, $lowest_pop\n";
print "Highest percentage of water: $highest_water_county, $highest_water\n";
print "Lowest percentage of water: $lowest_water_county, $lowest_water\n";
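As a side note on two of the points above, here is a minimal sketch. It assumes made-up sample values (not taken from the census file) and shows that keys is given a single hash, and that format directives such as %.2f are only applied by printf or sprintf, not by plain print. The variable @lines is introduced here purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real values come from the census file.
my %water      = ('Adams County' => 0.0025, 'Asotin County' => 0.0126);
my %popdensity = ('Adams County' => 9.7,    'Asotin County' => 34.0);

# keys takes exactly one hash. Both hashes share the same keys,
# so iterating over one of them is enough.
my @lines;
foreach my $county (sort keys %water) {
    # sprintf applies the format directives; plain print would
    # output "%.2f%%" literally followed by the number.
    push @lines, sprintf("%s: %.2f%% water, %.1f people/sq mile",
        $county, 100 * $water{$county}, $popdensity{$county});
}
print "$_\n" for @lines;
```

The sort is only there to make the output order predictable, since keys returns the keys in no particular order.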
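Answer 1 (score: 1)
You can (and really should) eliminate the foreach loop by keeping track of the highest/lowest population density and water percentage as you read each line. For example, if the new water % is greater than the highest one seen so far, replace the stored value with the new one. That way you always have the highest water %. Do the same for the other three values. The foreach is very inefficient, because you iterate over all the keys again for every new county.
Your use of variables is fine, but I'd tend to use a hash of arrays (HoA) to track the high/low information. Here is an HoA structure: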
my %hash = ('high_pop' => ['King County','912.87']);
You get the county name with $hash{high_pop}[0] and the population density with $hash{high_pop}[1].
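To make the access pattern concrete, here is a small sketch of reading and replacing an entry in that structure. The 'Example County' record and the variables $name, $density, $new_name and $new_density are hypothetical, used only for illustration.

```perl
use strict;
use warnings;

# Structure from the answer: each key maps to [county name, value].
my %hash = ('high_pop' => ['King County', '912.87']);

# Element access through the array reference stored in the hash.
my $name    = $hash{high_pop}[0];
my $density = $hash{high_pop}[1];
print "$name: $density\n";

# Hypothetical update: replace the whole record when a larger value appears.
my ($new_name, $new_density) = ('Example County', 1000.50);
if ($new_density > $hash{high_pop}[1]) {
    $hash{high_pop} = [$new_name, $new_density];
}
print "$hash{high_pop}[0]: $hash{high_pop}[1]\n";
```

Keeping the name and the value together in one array reference means a single assignment updates both, so they cannot drift out of sync.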
Given the above, consider the following:
use strict;
use warnings;
my %hash;
$hash{high_pop}[1] = 0;
$hash{low_pop}[1] = 9999999;
$hash{high_water}[1] = 0;
$hash{low_water}[1] = 100;
print "County\tPopulation/sq mile\t% Water\n";
while (<>) {
next if $. == 1;
chomp;
my @fields = split /,/;
my $popSqMi = sprintf '%.2f', $fields[1] / $fields[2];
my $percntWat = sprintf '%.2f', ( $fields[3] / ( $fields[2] + $fields[3] ) ) * 100;
print "$fields[0]\t$popSqMi\t$percntWat\n";
if ( $popSqMi > $hash{high_pop}[1] ) {
$hash{high_pop}[0] = $fields[0];
$hash{high_pop}[1] = $popSqMi;
}
if ( $popSqMi < $hash{low_pop}[1] ) {
$hash{low_pop}[0] = $fields[0];
$hash{low_pop}[1] = $popSqMi;
}
if ( $percntWat > $hash{high_water}[1] ) {
$hash{high_water}[0] = $fields[0];
$hash{high_water}[1] = $percntWat;
}
if ( $percntWat < $hash{low_water}[1] ) {
$hash{low_water}[0] = $fields[0];
$hash{low_water}[1] = $percntWat;
}
}
print "\nHighest population density: $hash{high_pop}[0], $hash{high_pop}[1]\n";
print "Lowest population density: $hash{low_pop}[0], $hash{low_pop}[1]\n";
print "Highest percentage of water: $hash{high_water}[0], $hash{high_water}[1]\n";
print "Lowest percentage of water: $hash{low_water}[0], $hash{low_water}[1]\n";
Hope this helps!