Question

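I am writing a Perl script to find how often each character occurs in a message. Here is the logic I am following:

Read one character at a time from the message using getc() and store it in an array.
Run a for loop from index 0 up to the length of this array.
This loop reads each character of the array and assigns it to a temp variable.
Run another for loop nested inside the one above, running from the index of the character being tested up to the length of the array.
Using a string comparison between this character and the character at the current array index, increment a counter if they are equal.
After the inner for loop finishes, I print the character's frequency for debugging.

Problem: I don't want the program to recompute the frequency of a character if it has already been calculated. For instance, if the character "a" occurs 3 times, the first run calculates the correct frequency. However, at the next occurrence of "a", since the loop runs from that index to the end, the frequency is (actual frequency - 1). Similarly, for the third occurrence the frequency is (actual frequency - 2).

To solve this, I used another temp array onto which I push each character whose frequency has already been evaluated.

Then, on the next run of the for loop, before entering the inner for loop, I compare the current char with the array of already-evaluated chars and set a flag. The inner for loop runs based on that flag.

This isn't working for me. The results are still the same.

Here is the code I have written to accomplish the above: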

#!/usr/bin/perl

use strict;
use warnings;

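my $input = $ARGV[0];
# $count has to be declared too, or "use strict" refuses to compile the script
my ($c, $ch, $count, $flag, $s, @arr, @temp);

open(INPUT, '<', $input) or die "Cannot open $input: $!";

# read the file one character at a time
while (defined($c = getc(INPUT)))
{
    push(@arr, $c);
}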

close(INPUT);

my $length=$#arr+1;

for (my $i = 0; $i < $length; $i++)
{
    $count = 0;
    $flag = 0;
    $ch = $arr[$i];
    foreach $s (@temp)
    {
        if ($ch eq $s)
        {
            $flag = 1;
        }
    }
    if ($flag == 0)
    {
        for (my $k = $i; $k < $length; $k++)
        {
            if ($ch eq $arr[$k])
            {
                $count = $count + 1;
            }
        }
        push(@temp, $ch);
        print "The character \"".$ch."\" appears ".$count." number of times in the message"."\n";
    }
}

Answer 1

You're making your life much harder than it needs to be. Use a hash:

my %freq;

while(defined($c = getc(INPUT)))
{
  $freq{$c}++;
}

print $_, " ", $freq{$_}, "\n" for sort keys %freq;

$freq{$c}++ increments the value stored in $freq{$c}. (If it is unset or zero, it becomes one.)

The print line is equivalent to:

foreach my $key (sort keys %freq) {
  print $key, " ", $freq{$key}, "\n";
}
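
For anyone who wants to try the hash approach without a file on disk, here is a minimal self-contained sketch. The helper name count_chars and the sample message are made up for illustration, and an in-memory filehandle stands in for the real input file; the counting loop itself is the one from the answer above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_chars is a hypothetical helper name, not part of the original answer.
# It reads an already-open filehandle one character at a time with getc
# and returns a reference to a hash of character => count.
sub count_chars {
    my ($fh) = @_;
    my %freq;
    while (defined(my $c = getc($fh))) {
        $freq{$c}++;    # a missing key starts out undef, which ++ treats as 0
    }
    return \%freq;
}

# Assumption: an in-memory filehandle stands in for the input file.
my $message = "hello world";
open(my $fh, '<', \$message) or die "Cannot open in-memory file: $!";
my $freq = count_chars($fh);
close($fh);

for my $char (sort keys %$freq) {
    print "'$char' appears $freq->{$char} time(s)\n";
}
```

Each character is visited exactly once, so there is no need for a second array of already-counted characters: the hash key itself records that a character has been seen.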

Answer 2

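If you want to count a single character across the entire file, use any of the methods suggested in the other answers. If you want a count of all the occurrences of each character in the file, I propose: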

#!/usr/bin/perl

use strict;
use warnings;

# read in the contents of the file
my $contents;
open(TMP, "<$ARGV[0]") or die ("Failed to open $ARGV[0]: $!");
{
    local($/) = undef;
    $contents = <TMP>;
}
close(TMP);

# split the contents around each character
my @bits = split(//, $contents);

# build the hash of each character with its respective count
my %counts = map { 
    # use lc($_) to make the search case-insensitive
    my $foo = $_; 

    # filter out newlines
    $_ ne "\n" ? 
        ($foo => scalar grep {$_ eq $foo} @bits) :
        () } @bits;

# reverse sort (highest first) the hash values and print
foreach(reverse sort {$counts{$a} <=> $counts{$b}} keys %counts) {
    print "$_: $counts{$_}\n";
}

Answer 3

A faster solution:

my @result = $subject =~ m/a/g; # $subject holds the contents of your file

print "Found : ", scalar @result, " a characters in file!\n";

Of course, you can put a variable in place of 'a', or even better, run this line for whichever characters you want to count.
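
As a rough sketch of the "variable in place of 'a'" idea: the sample string and variable names below are invented for illustration. The point it shows is that a character held in a variable should be wrapped in \Q...\E, otherwise a regex metacharacter such as '.' would match every character instead of only a literal dot.

```perl
use strict;
use warnings;

# Assumed sample data; in practice $subject would hold the file contents.
my $subject = "a.b.c.a";
my $char    = '.';

# \Q...\E escapes regex metacharacters, so '.' matches only a literal dot.
# Assigning the match to an empty list and then to a scalar yields the
# number of matches without keeping them in an array.
my $count = () = $subject =~ /\Q$char\E/g;
print "Found $count occurrence(s) of '$char'\n";

# For a character known in advance, tr/// counts without the regex engine.
# tr does not interpolate variables, so the character is written literally.
my $a_count = ($subject =~ tr/a//);
print "Found $a_count occurrence(s) of 'a'\n";
```

Note that this counts one character per pass over the string, so it suits a handful of characters; for a full frequency table, a single pass with a hash does less work.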

Answer 4

I didn't understand the problem you're trying to solve, so here is a simpler way to count a character in a string:

$string = "fooooooobar";
$char = 'o';
$count = grep {$_ eq $char} split //, $string;
print $count, "\n";

This prints the number of occurrences of $char in $string (7). Hope this helps you write more compact code.

Answer 5

As a one-liner:

perl -F"" -anE '$h{$_}++ for @F; END { say "$_ : $h{$_}" for keys %h }' foo.txt

Counting the frequency of characters in a message using Perl
