Perl code to count the number of words in a text file

Time: 2014-07-15 06:19:31

Tags: perl count line each words

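I wrote some code that counts the number of words and the number of lines in a text file. Now I want the output to show the number of words on each line. For example, with the word count shown after each line of the input file: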

hello there- 2
i am one of those-5

My code so far:

$filename = "editnlp.txt";
open( FILE, "<", $filename ) or die "Cannot read $filename: $!\n";

$lines   = 0;
$words   = 0;
$letters = 0;

while ( $line = <FILE> ) {
    @words = split( " ", $line );

    $nwords = @words;

    for ( $i = 0 ; $i < $nwords ; $i = $i + 1 ) {
        @letters = split( "", $words[$i] );

        $nletters = @letters;

        $letters = $letters + $nletters;
    }

    $words = $words + $nwords;
    $lines = $lines + 1;
}

print "$filename contains $lines lines, $words words " . "and $letters letters.\n";

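It counts the words in the whole text file correctly, but I can't work out how to change it so that it reports the number of words on each line.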

3 Answers:

Answer 0: (score: 0)

#!/usr/bin/perl

sub linux_style_count {
        $filename = shift;
        $lines = `wc -l < $filename`;
        $words = `wc -w < $filename`;
        $chars = `wc -c < $filename`;

        print "\nFile $filename contains";
        print "\nNumber of Lines = $lines";
        print "Number of Words = $words";
        print "Number of Chars = $chars\n";
}

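sub perl_style_count {
        my $filename = shift;
        # Three-argument open with a lexical filehandle keeps the mode separate from the name
        open(my $fh, '<', $filename) or die "Could not open file $filename: $!";

        my ($lines, $words, $chars) = (0,0,0);

        while (my $line = <$fh>) {
            $lines++;
            $chars += length($line);
            # split ' ' skips leading whitespace, so indented lines are not over-counted
            my @fields = split(' ', $line);
            $words += scalar(@fields);
        }
        close($fh);

        print "\nFile $filename contains";
        print "\nNumber of Lines = " . $lines;
        print "\nNumber of Words = " . $words;
        print "\nNumber of Chars = " . $chars . "\n";
}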

@files = <*>;
foreach $file (@files) {
        if (-f $file) {
            &linux_style_count($file);
            &perl_style_count($file);
        }
}

Output

[root@localhost /]# perl count.pl

File mytext.txt contains
Number of Lines = 9
Number of Words = 22
Number of Chars = 214


File mytext.txt contains
Number of Lines = 9
Number of Words = 22
Number of Chars = 214

File hello.txt contains
Number of Lines = 41
Number of Words = 127
Number of Chars = 888


File hello.txt contains
Number of Lines = 41
Number of Words = 127
Number of Chars = 888

File config.ini contains
Number of Lines = 32
Number of Words = 58
Number of Chars = 538


File config.ini contains
Number of Lines = 32
Number of Words = 58
Number of Chars = 538
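
One detail worth knowing when counting words with split: the pattern changes the result for lines that start with whitespace. The following is a minimal sketch, not part of the original answer; the two helper names are made up for illustration. It compares the special pattern ' ' (which skips leading whitespace) with /\s+/ (which keeps an empty leading field).

```perl
use strict;
use warnings;

# Hypothetical helper: count words with the special ' ' pattern,
# which ignores leading whitespace.
sub count_words_space {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return scalar @fields;
}

# Hypothetical helper: count words with /\s+/, which produces an
# empty leading field when the line starts with whitespace.
sub count_words_regex {
    my ($line) = @_;
    my @fields = split /\s+/, $line;
    return scalar @fields;
}

my $line = "  hello there\n";    # note the two leading spaces
print "split ' '  counts ", count_words_space($line), " words\n";
print "split \\s+  counts ", count_words_regex($line), " fields\n";
```

For the indented sample line, the first helper returns 2 and the second returns 3, so the ' ' form is usually the one that agrees with a whitespace-delimited word count.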

Answer 1: (score: 0)

Alternatively, use "wc", which does the same job as the Unix command of that name.

See This

Answer 2: (score: 0)

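You can try this code. It prints the number of words on each line and, once the whole file has been read, prints the total number of words and the total number of lines in the file.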

# Count the number of words on each line of a file
open(my $fh, "<", "file1.txt") or die "Couldn't open file file1.txt, $!";

$lines   = 0;    # number of lines read so far
$nwords  = 0;    # words on the current line
$total   = 0;    # words in the whole file

print "\n";
while ( $line = <$fh> ) {
    $lines = $lines + 1;
    @words = split( " ", $line );    # split on whitespace, ignoring leading spaces
    $nwords = @words;                # an array in scalar context gives its length
    print "Number of words on the line $lines are : $nwords \n";
    $total = $total + $nwords;
}
close($fh);

print "\nTotal no. of words in file are $total \n";
print "\nTotal no. of lines in file are $lines \n";

The output will look like this.

Number of words on the line 1 are : 10
Number of words on the line 2 are : 5
Number of words on the line 3 are : 0
Number of words on the line 4 are : 8
Number of words on the line 5 are : 0
Number of words on the line 6 are : 0
Number of words on the line 7 are : 10
Number of words on the line 8 are : 0
Number of words on the line 9 are : 0
Number of words on the line 10 are : 0
Number of words on the line 11 are : 0
Number of words on the line 12 are : 0
Number of words on the line 13 are : 10
Number of words on the line 14 are : 7
Number of words on the line 15 are : 0
Number of words on the line 16 are : 8

Total no. of words in file are 58

Total no. of lines in file are 16
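
If the goal is the layout shown in the question, with each line followed by its word count, the same per-line counting can be arranged as below. This is a rough sketch under assumptions: count_words and annotate_line are hypothetical helper names, and the sample lines are hard-coded from the question rather than read from a file.

```perl
use strict;
use warnings;

# Hypothetical helper: number of whitespace-separated words on a line.
sub count_words {
    my ($line) = @_;
    my @words = split ' ', $line;
    return scalar @words;
}

# Hypothetical helper: the line (without its newline) followed by
# "- " and its word count, e.g. "hello there- 2".
sub annotate_line {
    my ($line) = @_;
    chomp $line;
    return $line . '- ' . count_words($line);
}

# Sample input taken from the question; a real script would read
# these lines from a filehandle instead.
my @sample = ("hello there\n", "i am one of those\n");

my $total = 0;
for my $line (@sample) {
    print annotate_line($line), "\n";
    $total += count_words($line);
}
print "Total words: $total\n";
```

To run it against a real file, the loop over @sample could be swapped for a while loop over a filehandle opened with the three-argument form of open.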