Printing word frequencies from a text file in Perl

Time: 2014-11-06 04:57:25

Tags: perl scripting scripting-language

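I am trying to print the number of lines, words, and characters in a file, and then print each word in the file along with the number of times it occurs. I am getting an error in the last part (printing the words and their occurrence counts). Everything else works fine.

The error messages I get: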

Bareword found where operator expected at wc.pl line 34, near ""Number of lines: $lcnt\","Frequency"
        (Missing operator before Frequency?)
syntax error at wc.pl line 34, near ""Number of lines: $lcnt\","Frequency of "
Can't find string terminator '"' anywhere before EOF at wc.pl line 34.

Here is my code:

#!/usr/bin/perl -w

use warnings;
use strict;


my $lcnt = 0;
my $wcnt = 0;
my $ccnt = 0;
my %count;
my $word;
my $count;

open my $INFILE, '<', $ARGV[0] or die $!;

while( my $line = <$INFILE> ) {

$lcnt++;

$ccnt += length($line);

my @words = split(/\s+/, $line);

$wcnt += scalar(@words);

        foreach $count(@words) {
            $count{@words}++;
        }
}

foreach $word (sort keys %count) {


print "Number of characters: $ccnt\n","Number of words: $wcnt\n","Number of lines: $lcnt\","Frequency of words in the file: $word : $count{$word}";

}

close $INFILE;

This is what I need to do:

Sample input from a txt file:

This is a test, another test
#test# 234test test234

Sample output:

Number of characters: 52
Number of words: 9
Number of lines: 2
Frequency of words in the file:
--------------------------------
#test#: 1
234test: 1
This: 1
a: 1
another: 1
is: 1
test: 1
test,: 1
test234: 1

Any help is greatly appreciated!

2 answers:

Answer 0 (score: 2)

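Your code has a few logic errors and a few variable errors. On the logic side, you only need to print "Number of characters" once, but you put it inside a loop, along with the other totals that should also be printed only once. Pull them out of the loop.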

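Next, you are not counting correctly: you never actually use the current word from the foreach $count (@words) line. That is what I mean by variable misuse; "$count{@words}++" is definitely not what you want.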
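To see why the original key is wrong, here is a minimal sketch (the input string and the names %wrong and %right are made up for illustration). An array used as a single hash subscript is evaluated in scalar context, so the key becomes the number of elements rather than any word:

```perl
use strict;
use warnings;

# Illustrative input: 3 words, 2 of them distinct.
my @words = split /\s+/, "a b a";

# Buggy form from the question: @words in scalar context yields its
# element count, so every iteration increments the single key "3".
my %wrong;
foreach my $w (@words) {
    $wrong{@words}++;
}

# Intended form: key on the loop variable itself.
my %right;
foreach my $w (@words) {
    $right{$w}++;
}

print join(",", map { "$_=$wrong{$_}" } sort keys %wrong), "\n";  # prints "3=3"
print join(",", map { "$_=$right{$_}" } sort keys %right), "\n";  # prints "a=2,b=1"
```

The buggy loop runs without complaint, which is why the miscount is easy to overlook once the syntax error is fixed.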

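There is also a typo that makes Perl report a syntax error: the n is missing from one \n. An easy fix.

Finally, it is better to declare variables in the narrowest scope possible. Here is how it looks: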

my $lcnt = 0;
my $wcnt = 0;
my $ccnt = 0;
my %count;

while( my $line = <DATA> ) {

    $lcnt++;
    $ccnt += length($line);

    my @words = split(/\s+/, $line);
    $wcnt += scalar(@words);

    foreach my $word (@words) {
        $count{$word}++;
    }
}

print "Number of characters: $ccnt\n",
      "Number of words: $wcnt\n",
      "Number of lines: $lcnt\n",
      "Frequency of words in the file:\n",
      "-----------------------------------\n";

foreach my $word (sort keys %count) {
    print "$word: $count{$word}\n";
}

__DATA__
This is a test, another test
#test# 234test test234

For simplicity, I switched to the __DATA__ filehandle here. You can easily switch back to opening the input file.
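Switching back to a real input file could look roughly like the sketch below. The helper name count_file is hypothetical and not part of the answer above, and it uses split ' ' instead of split /\s+/ so that a line with leading whitespace does not produce an empty first "word"; treat both as assumptions rather than the answer's own method.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): tallies lines,
# words, characters and per-word frequencies for the file at $path.
sub count_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";

    my ($lcnt, $wcnt, $ccnt) = (0, 0, 0);
    my %count;
    while (my $line = <$in>) {
        $lcnt++;
        $ccnt += length $line;           # includes the trailing newline
        my @words = split ' ', $line;    # assumption: skip leading whitespace
        $wcnt += @words;                 # array in numeric context = element count
        $count{$_}++ for @words;
    }
    close $in;
    return ($lcnt, $wcnt, $ccnt, \%count);
}

# Only run the report when a file name is given on the command line.
if (@ARGV) {
    my ($lcnt, $wcnt, $ccnt, $count) = count_file($ARGV[0]);
    print "Number of characters: $ccnt\n",
          "Number of words: $wcnt\n",
          "Number of lines: $lcnt\n",
          "Frequency of words in the file:\n",
          "--------------------------------\n";
    print "$_: $count->{$_}\n" for sort keys %$count;
}
```

Saved as wc.pl, it would be run as perl wc.pl input.txt, where input.txt stands for whatever file holds the sample input.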

Answer 1 (score: 1)

It looks like you meant to write \n but wrote \" instead, which escapes the quote that was supposed to end the string.

Change this:

... "Number of lines: $lcnt\","Frequency of ...

To this:

... "Number of lines: $lcnt\n","Frequency of ...