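This regex business is getting old. :( One more question: I need to count the number of words and the number of sentences in a paragraph. The code I tried is: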
my $sentencecount = $file =~ s/((^|\s)\S).*?(\.|\?|\!)/$1/g;
my $count = $file =~ s/((^|\s)\S)/$2/g;
print "Input file $ARGV[1] contains $sentencecount sentences and $count words.";
Both counts come back as 63. I know that is wrong, at least for the word count. Is this a consequence of counting with substitutions? If so, how can I fix it?
Answer 0: (score: 2)
I suggest looking at Perl's split function; see perlfunc(1):
If EXPR is omitted, splits the $_ string. If PATTERN is also
omitted, splits on whitespace (after skipping any leading
whitespace). Anything matching PATTERN is taken to be a
delimiter separating the fields. (Note that the delimiter may
be longer than one character.)
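A minimal sketch of how split could be used for both counts. The sample text and variable names are assumptions made for illustration, and treating any run of ., ! or ? as a sentence boundary is a simplification, not a rule taken from the question.

```perl
use strict;
use warnings;

# Hypothetical sample text; the question reads $file from a file instead.
my $file = "Hello... world! Is this the third sentence? Yes.\n";

# split ' ' is the special whitespace mode: it skips leading whitespace
# and splits on runs of whitespace, so each field is one word.
my @words = split ' ', $file;

# Splitting on runs of terminators leaves one fragment per sentence;
# grep drops whitespace-only fragments such as the trailing newline.
my @sentences = grep { /\S/ } split /[.!?]+/, $file;

printf "%d words, %d sentences\n", scalar @words, scalar @sentences;
```

Because split returns a list, this builds the full list of words in memory, which is fine for a paragraph but may matter for very large inputs.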
Answer 1: (score: 1)
my $wordCount = 0;
++$wordCount while $file =~ /\S+/g;
my $sentenceCount = 0;
++$sentenceCount while $file =~ /[.!?]+/g;
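The //g matches are done in scalar context, which avoids building a huge list of all the words or all the sentences and so saves memory if the file is large. The sentence-counting code treats any run of sentence-ending delimiters as the end of a single sentence (for example, Hello... world! is counted as 2 sentences).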
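As for why the question's code reports the same number twice: s///g returns the number of substitutions, but it also rewrites the string in place, so the second substitution runs on text that has already been reduced to one leftover "word" per sentence. The sketch below reproduces this on a short sample string (an assumption, not the asker's data) and compares it with match-only counting.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the asker's file contents.
my $file = "Hello world. How are you? Fine!";

# The question's approach, run on a copy so $file stays untouched.
my $copy = $file;
my $sentences_by_subst = $copy =~ s/((^|\s)\S).*?(\.|\?|\!)/$1/g;
# $copy is now "H H F": only the first character of each sentence is left,
# so the next substitution finds exactly one "word" per sentence.
my $words_by_subst = $copy =~ s/((^|\s)\S)/$2/g;

# Match-only counting never modifies the text.
my $words = 0;
++$words while $file =~ /\S+/g;
my $sentences = 0;
++$sentences while $file =~ /[.!?]+/g;

print "substitution: $sentences_by_subst sentences, $words_by_subst words\n";
print "matching: $sentences sentences, $words words\n";
```

On this sample the substitution version reports 3 sentences and 3 words, while the matching version reports 3 sentences and 6 words.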
Answer 2: (score: 0)
This can be obtained from $file:
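my $file = "This is praveen worki67ng in RL websolutions";
my $count = () = $file =~ /\S+/g;    # number of words (runs of non-whitespace)
my $counter = () = $file =~ /\S/g;   # number of non-whitespace characters, not sentences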
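The "= () =" idiom works because a list assignment evaluated in scalar context yields the number of elements on its right-hand side. A small sketch extending it to sentences follows; the variable names and the terminator pattern [.!?]+ are assumptions borrowed from the other answer, not part of this one.

```perl
use strict;
use warnings;

my $file = "This is praveen worki67ng in RL websolutions";

# Each match list is assigned to the empty list (); the outer scalar
# assignment then receives the number of matches that were produced.
my $words     = () = $file =~ /\S+/g;      # runs of non-whitespace
my $chars     = () = $file =~ /\S/g;       # single non-whitespace characters
my $sentences = () = $file =~ /[.!?]+/g;   # runs of sentence terminators

print "$words words, $chars characters, $sentences sentences\n";
```

For this string the sketch reports 7 words and 0 sentences, since the text contains no terminator. Unlike the scalar-context while loop, this form builds the whole match list before counting it.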