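I want to split a string into words while also capturing the characters that end a sentence, such as ., ? and !.
In other words, my regex should split the string on whitespace and on the special characters that end an English sentence (., ? and !), but it should keep those characters as separate elements.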
I know this is a bit confusing, so here is an example. Given a sentence like this:
why you are eating too much?
the array that stores the words should look like this:
@word = ( "why", "you", "are", "eating", "too", "much", "?" );
But the array my code produces looks like this:
@word=("why"," ","you","are","eating","too"," ","much","?","?");
Code:
my $s = "why you are eating too much?";
my @word = split /(\s+|([\s+.?!]))/, $s;
for ( @word ) {
    print "$_\n";
}
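A note on where the extra elements come from: when the separator pattern passed to split contains capture groups, split returns the text captured by each group in addition to the fields, and a group that did not take part in a match yields undef. The sketch below uses the same input string and pattern as above; the variable names @raw and @plain are only illustrative.

```perl
use strict;
use warnings;

my $s = "why you are eating too much?";

# Each separator match contributes one list element per capture group;
# a group that did not participate in the match contributes undef.
my @raw = split /(\s+|([\s+.?!]))/, $s;

# Print the list with undef made visible.
print join(", ", map { defined $_ ? "'$_'" : "undef" } @raw), "\n";

# Without capture groups the separators are discarded entirely,
# so the trailing "?" is lost as well.
my @plain = split /\s+|[.?!]/, $s;
print join(", ", map { "'$_'" } @plain), "\n";
```

Neither form gives the desired list directly: the capturing version adds whitespace and undef entries and duplicates the "?", while the non-capturing version drops the punctuation.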
Answer 0: (score: 1)
If you know what you want to throw away, use split.
If you know what you want to keep, use m//g in list context.
This looks like a case of the latter:
my $str = "why are you eating too much?";
my @words = $str =~ m/[^\s.!?]+|[.!?]/g;
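As a rough illustration of how this pattern behaves on more than one sentence, here is a minimal runnable sketch. The input string $text is made up for the example; only the regex comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical multi-sentence input, chosen only for illustration.
my $text = "Stop. why you are eating too much? Enough!";

# In list context, m//g returns every match: either a run of characters
# that are neither whitespace nor . ! ?, or a single . ! ? character.
my @tokens = $text =~ m/[^\s.!?]+|[.!?]/g;

print "$_\n" for @tokens;
```

Each word and each sentence-ending character is printed on its own line, and whitespace never appears in the result because neither alternative can match it.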
Answer 1: (score: 0)
You can use the following regex instead of split():
(\w+|[\.!?])
Here is sample code in Perl:
use Data::Dumper;
my $str = "why you are eating too much?";
my @matches = $str =~ /(\w+|[\.!?])/g;
print Dumper \@matches;
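The \w+ pattern and the [^\s.!?]+ pattern from the other answer agree on the sample sentence but can differ on other input, because \w matches only word characters. The sketch below compares the two on a made-up string containing an apostrophe, a hyphen and a comma; the input and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical input with characters that are neither \w nor . ! ?
my $str = "don't over-eat, ok?";

# \w+ stops at the apostrophe, hyphen and comma, and those characters
# match neither alternative, so they are silently dropped.
my @by_word = $str =~ /(\w+|[\.!?])/g;

# [^\s.!?]+ keeps everything except whitespace and . ! ? inside a token,
# so the comma stays attached to the preceding word.
my @by_nonspace = $str =~ /[^\s.!?]+|[.!?]/g;

print "word-based:     ", join(" | ", @by_word), "\n";
print "nonspace-based: ", join(" | ", @by_nonspace), "\n";
```

Which behavior is preferable depends on the text being processed; if commas or other punctuation should become separate tokens too, they would need to be added to the character classes.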