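I am reading a file word by word (the file contains one word per line) and storing each word in a hash. I want to store the number of occurrences as well as the lines on which the word was found (note: I will sort the hash by the word itself, as shown in the code).
Here is what I have (not working; assume the @words array is populated correctly, with no special characters, and all lowercase):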
my %wordlist;
my $line = 0;
foreach my $word (@words) {
    $line++;
    if (exists $wordlist{$word}) {
        $wordlist{$word} += 1;
        $wordlist{$line} = $wordlist{$line} . ", $line";
    }
    else {
        $wordlist{$word} = 1;
        $wordlist{$line} = "$line";
    }
}
Later, I try to print $wordlist{$line} as a string in a loop that contains:
printf "%${length}s: %4d times, on lines %s\n", $key, $wordlist{$key}, $wordlist{$line};
When I run it, I get the error:
Use of uninitialized value in printf at ./wc.pl line 105, <FILE> line 20.
someWord: 2 time(s), line(s)
where line 20 is the exit statement.
Answer 0 (score: 0)
$wordlist{$line} # Line data for each line
should be
$wordline{$word} # Line data for each word
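Formatting data for output before you actually output it is generally bad practice, and this is no exception. Store the line numbers in an array instead:
if (exists $wordlist{$word}) {
    ++$wordlist{$word};
    push @{ $wordline{$word} }, $line;
}
else {
    ++$wordlist{$word};
    push @{ $wordline{$word} }, $line;
}
which of course simplifies to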
++$wordlist{$word};
push @{ $wordline{$word} }, $line;
In the printf, you can use
join(', ', @{ $wordline{$word} })
But $wordlist{$word} is just the number of elements in @{ $wordline{$word} }, so it is not needed at all. Just use
0+@{ $wordline{$word} }
instead of
$wordlist{$word}
So you end up with
use strict;
use warnings;
use List::Util qw( max );
my %wordlines;
while (<>) {
    chomp;
    push @{ $wordlines{$_} }, $.;
}
my $max_len_p1 = 1 + max map length, keys %wordlines;
my $max_count_len = max map length(0+@$_), values %wordlines;
my $format = "%-${max_len_p1}s %${max_count_len}d times, on lines %s\n";
for my $word (
    sort { @{ $wordlines{$b} } <=> @{ $wordlines{$a} } || $a cmp $b }
        keys %wordlines
) {
    printf($format,
        "$word:",
        0+@{ $wordlines{$word} },
        join(', ', @{ $wordlines{$word} }),
    );
}
Input:
cat
house
stair
chari
stair
mouse
stool
cat
hat
Output:
cat:   2 times, on lines 1, 8
stair: 2 times, on lines 3, 5
chari: 1 times, on lines 4
hat:   1 times, on lines 9
house: 1 times, on lines 2
mouse: 1 times, on lines 6
stool: 1 times, on lines 7
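The 0+@{ ... } expression in the listing works because an array evaluated in numeric (scalar) context yields its element count. Here is a minimal sketch of that idiom, assuming a hand-filled %wordlines rather than one read from a file; the variable names $count_plus and $count_scalar are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical data: word => array ref of the line numbers it appeared on.
my %wordlines = (
    cat   => [ 1, 8 ],
    house => [ 2 ],
);

# Adding 0 forces numeric (scalar) context, so the array yields its size.
my $count_plus = 0 + @{ $wordlines{cat} };

# scalar() is an equivalent, more explicit spelling of the same thing.
my $count_scalar = scalar @{ $wordlines{cat} };

print "$count_plus $count_scalar\n";    # prints "2 2"
```

Which spelling to use is a matter of taste; inside an argument list such as printf's, one of the two is needed because the arguments are otherwise evaluated in list context.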
Answer 1 (score: 0)
You can try the following example, which should give you a good base to start from and modify.
use strict;
use warnings;
my @words = <>;
my %wordlist;
my $line = 0;
foreach my $word (@words) {
    chomp($word);
    push (@{$wordlist{$word}}, ++$line);
}
foreach my $word (keys %wordlist){
    my $count = @{$wordlist{$word}};
    my $lines = join (', ',@{$wordlist{$word}});
    printf ("%-10s: %4d times, on lines %s\n", $word, $count, $lines);
}
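This example uses Perl's autovivification to create the data structure on the fly if it does not already exist. In essence, for each word it reads, it pushes the line number onto the array stored under that word's key in the hash. If the word has not been seen before, autovivification creates the key in the hash and, in the same way, creates an array as its value.
For the output, we get the word because it is the key, we get its count by counting the line numbers in the array held in the hash value, and we build a string of the line numbers with join.
We can then print these values with printf.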
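To make the autovivification step concrete, here is a small sketch that pushes onto a key that does not exist yet. It assumes nothing beyond core Perl; the word and line numbers are made-up sample values:

```perl
use strict;
use warnings;

my %wordlist;    # starts out empty

# No key 'cat' exists yet. Dereferencing the undefined value as an array
# makes Perl create both the hash key and an anonymous array for it.
push @{ $wordlist{cat} }, 1;

# The key exists now, so this simply appends to the same array.
push @{ $wordlist{cat} }, 8;

# The value is an ordinary array reference holding both line numbers.
print ref($wordlist{cat}), ': ', join(', ', @{ $wordlist{cat} }), "\n";
# prints "ARRAY: 1, 8"
```

This is why the loop needs no exists check: the first push for a word and every later one are the same statement.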
A word list of
cat
house
stair
chari
stair
mouse
stool
cat
hat
will produce the output
mouse     :    1 times, on lines 6
cat       :    2 times, on lines 1, 8
hat       :    1 times, on lines 9
stool     :    1 times, on lines 7
chari     :    1 times, on lines 4
stair     :    2 times, on lines 3, 5
house     :    1 times, on lines 2
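The words above come out in no particular order because Perl does not guarantee any ordering for hash keys. Since the question mentions sorting by the word itself, one possible adjustment is to iterate over sort keys. This is a sketch with a hand-filled %wordlist so it runs without an input file; @report is a hypothetical helper array, not part of the original answer:

```perl
use strict;
use warnings;

# Same structure the answer builds: word => array ref of line numbers.
my %wordlist = (
    stair => [ 3, 5 ],
    cat   => [ 1, 8 ],
    hat   => [ 9 ],
);

# @report (hypothetical) collects the formatted lines before printing.
my @report;
foreach my $word ( sort keys %wordlist ) {    # default sort: string order
    my $count = @{ $wordlist{$word} };        # scalar context: element count
    my $lines = join ', ', @{ $wordlist{$word} };
    push @report,
        sprintf( "%-10s: %4d times, on lines %s\n", $word, $count, $lines );
}

print @report;    # cat first, then hat, then stair
```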