How do I split a string and capture sentence-ending punctuation with a regex?

Time: 2017-02-19 13:26:20

Tags: regex perl

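I want to split a string and also capture the characters that end a sentence, such as .?!

In other words, my regex splits the string on whitespace and on the special characters that end English sentences (.?!), but it should keep those characters.

I know this is a bit confusing, so please look at the arrays below. Take a sentence like this: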

why you are eating too much?

The array that stores the words should look like this:

@word = ( "why", "you", "are", "eating", "too", "much", "?" );

But the array my code produces looks like this:

@word=("why"," ","you","are","eating","too"," ","much","?","?");

Code:

my $s = "why you are eating too much?";

my @word = split /(\s+|([\s+.?!]))/, $s;

for ( @word ){
    print "$_\n";
} 

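2 Answers:

Answer 0: (score: 1)

Use split if you know what you want to throw away. Use m//g in list context if you know what you want to keep.

This looks like a case of the latter: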

my $str = "why are you eating too much?";
my @words = $str =~ m/[^\s.!?]+|[.!?]/g;
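
A minimal runnable sketch of the same match, wrapped in a script so the result can be inspected. The join-based printing is only for illustration and is not part of the original answer.

```perl
use strict;
use warnings;

my $str = "why are you eating too much?";

# In list context, m//g returns every match in order:
# either a run of characters that are not whitespace and not . ! ?
# or a single sentence-ending punctuation mark.
my @words = $str =~ m/[^\s.!?]+|[.!?]/g;

print join("|", @words), "\n";   # why|are|you|eating|too|much|?
```

Because nothing in the pattern matches whitespace, the spaces never show up in the result, and because there are no capture groups, each match is returned exactly once.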

Answer 1: (score: 0)

You can use the following regex instead of split():

(\w+|[\.!?])

Here is sample code in Perl and a live example:

use Data::Dumper;

my $str = "why you are eating too much?";
my @matches = $str =~ /(\w+|[\.!?])/g;
print Dumper \@matches;
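
One difference from the character-class pattern in the other answer may be worth checking: \w+ only matches word characters, so apostrophes and commas are silently dropped. The sketch below compares the two patterns on a hypothetical input that is not from the question.

```perl
use strict;
use warnings;

# Hypothetical input, chosen because it contains an apostrophe and a comma.
my $str = "don't stop, please!";

my @by_word  = $str =~ /(\w+|[.!?])/g;       # pattern from this answer
my @by_class = $str =~ /[^\s.!?]+|[.!?]/g;   # pattern from the other answer

print join("|", @by_word),  "\n";   # don|t|stop|please|!
print join("|", @by_class), "\n";   # don't|stop,|please|!
```

Which behavior is preferable depends on whether characters such as apostrophes and commas should stay attached to the words.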