How do I read a text file sentence by sentence in Perl?

Date: 2017-03-12 23:31:57

Tags: perl

I want to read a text file sentence by sentence. My problem is that the code below splits only on periods.

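#!/usr/bin/perl

use strict;
use warnings;

my $file = "data.txt";
# Three-argument open with a lexical filehandle, and fail loudly on error
open(my $fh, '<', $file) or die "Could not open file '$file': $!";
$/ = '.';    # set the input record separator to a period
while ( my $sentence = <$fh> ) {
    # do_something
}
close $fh;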

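Is there any way to make $/ take a regular expression such as /[.?!]/, so that it splits sentences on question marks and exclamation marks as well, and not just on periods?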

1 answer:

Answer 0: (score: 1)

This can be done better with Lingua::Sentence:

use feature qw(say);
use strict;
use warnings;
use Lingua::Sentence;

my $fn = "data.txt";
open (my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
my $str = do {local $/; <$fh>};
close $fh;

for my $sentence (Lingua::Sentence->new("en")->split_array( $str)) {
    say $sentence;
}

With data.txt containing:

'How often do you come here?', asked Mr. Smith.
This is a paragraph. It contains several sentences. "But why," you ask?

we get the following output:

'How often do you come here?', asked Mr. Smith.
This is a paragraph.
It contains several sentences.
"But why," you ask?
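
On the original question: per perlvar, the value of $/ is always a literal string, never a regex, so it cannot be set to /[.?!]/. If installing a CPAN module is not an option, a rough core-Perl alternative is to slurp the file and split it with a lookbehind. The following is only a minimal sketch: split_sentences is a hypothetical helper name, the sample text stands in for data.txt, and the heuristic is naive (it would split after abbreviations such as "Mr.", which Lingua::Sentence handles correctly in the output above).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): split text into
# sentences wherever '.', '?' or '!' is followed by whitespace.
sub split_sentences {
    my ($text) = @_;
    $text =~ s/\s+/ /g;          # collapse newlines and runs of whitespace
    $text =~ s/^\s+|\s+$//g;     # trim leading and trailing whitespace
    # The lookbehind keeps the punctuation attached to its sentence
    return split /(?<=[.?!])\s+/, $text;
}

# Sample text standing in for data.txt; an in-memory filehandle keeps the
# sketch self-contained. For a real file, open its path instead.
my $data = "This is a paragraph. It contains\nseveral sentences! But why? Nobody knows.\n";
open(my $fh, '<', \$data) or die "Could not open in-memory file: $!";
my $text = do { local $/; <$fh> };    # slurp the whole input at once
close $fh;

my @sentences = split_sentences($text);
print "$_\n" for @sentences;
```

This prints one sentence per line (four lines for the sample text). For anything beyond simple input, a dedicated splitter such as Lingua::Sentence is likely the safer choice, since it knows about abbreviations.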