How do I match consecutive words in a sentence using Perl?

Time: 2011-12-13 09:27:39

Tags: perl

Is there a better way to match words than the approach below? I am trying to find which words from an array occur in each sentence.

 my $count = 0;
 my @strings = (
    "i'm going to find the occurrence of two words going if possible",
    "i'm going to find the occurrence of two words if impossible",
    "to find a solution to this problem",
    "i will try my best for a way to match this problem"
 );
 my @neurot = qw(going match possible);

 # One alternation; the \b anchors keep "possible" from matching inside "impossible".
 my $com_neu = '\b'.join('\b|\b', @neurot).'\b';

 foreach my $sentence (@strings){

 my @l = $sentence =~ /($com_neu)/gi; 

 foreach my $list (@l){ 
    if($list =~ m/\w['\w-]*/){
          print $list;
      $count++;
    }   
 }

 print $count;
 }

Output:

String 1: going going possible
String 2: going 
String 3:
String 4: match

Please help me find a faster way.

Thanks.

3 answers:

Answer 0: (score: 1)

Another approach is to use a hash to match the words:

my %neurot_hash = map { lc($_) => 1 } qw(going match possible);

for my $sentence (@strings) {
    for my $found (grep { $neurot_hash{ lc($_) } } $sentence =~ /\w['\w-]*/gi) {
        print $found, " ";
    }
    print "\n";
}

For the data you provided, this method is about 7% faster. Keep in mind that the data set is very small, so YMMV.
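A timing difference like this depends on the machine, the Perl version, and the input, so it is worth measuring on your own data. The following is only a sketch of how that could be done with the core Benchmark module; the sub names `match_regex` and `match_hash` are made up for illustration, and the regex variant is precompiled with `qr//`, which differs slightly from the string pattern in the question.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @strings = (
    "i'm going to find the occurrence of two words going if possible",
    "i'm going to find the occurrence of two words if impossible",
    "to find a solution to this problem",
    "i will try my best for a way to match this problem",
);
my @neurot = qw(going match possible);

# Approach from the question: one alternation regex, compiled once.
my $alternation = join '|', map { quotemeta } @neurot;
my $com_neu     = qr/\b(?:$alternation)\b/i;

# Approach from this answer: tokenize, then look each token up in a hash.
my %neurot_hash = map { lc($_) => 1 } @neurot;

# Hypothetical helper: all wanted words found by the regex in one sentence.
sub match_regex {
    my ($sentence) = @_;
    return $sentence =~ /($com_neu)/g;
}

# Hypothetical helper: all wanted words found by the hash lookup in one sentence.
sub match_hash {
    my ($sentence) = @_;
    return grep { $neurot_hash{ lc $_ } } $sentence =~ /\w['\w-]*/g;
}

# A negative count asks Benchmark to run each candidate for at least 1 CPU second.
cmpthese(-1, {
    regex => sub { my @found = map { match_regex($_) } @strings },
    hash  => sub { my @found = map { match_hash($_) } @strings },
});
```

Both helpers return the same words for the four sample sentences, but they are not equivalent in general: the hash version compares whole tokens, so a hyphenated token such as "going-away" would not count as "going", while the regex version would match it.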

Answer 1: (score: 1)

How about the 'smart match' operator?

foreach my $elem (@neurot){ if(/$elem/i ~~ @strings){ print "Found $elem\n"; } }
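If smart match is not available or not wanted, roughly the same check can be written with plain `grep`. This is a sketch rather than a drop-in replacement: the sub name `words_found` is hypothetical, and it adds `\b` anchors and `\Q...\E` quoting, so unlike the one-liner above it will not report "possible" when a sentence only contains "impossible".

```perl
use strict;
use warnings;

my @strings = (
    "i'm going to find the occurrence of two words going if possible",
    "i'm going to find the occurrence of two words if impossible",
    "to find a solution to this problem",
    "i will try my best for a way to match this problem",
);
my @neurot = qw(going match possible);

# Hypothetical helper: return the words that occur, as whole words,
# in at least one of the sentences.
sub words_found {
    my ($words, $sentences) = @_;
    return grep {
        my $re = qr/\b\Q$_\E\b/i;                 # compile once per word
        scalar grep { $_ =~ $re } @$sentences;    # count of matching sentences
    } @$words;
}

print "Found $_\n" for words_found(\@neurot, \@strings);
```

Like the smart match version, this only tells you which words occur somewhere in the list of sentences, not which sentence they occur in or how often.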

Answer 2: (score: 0)

Same as bvr's answer, but perhaps clearer:

my %neurot_hash = map { lc($_) => 1 } qw(going match possible);

for my $sentence (@strings) {
    my @words = split /[^\w']/, $sentence; 
            #I am not sure if you want to take "i'm" as a separate word. 
            #Apparently, stackoverflow does not like '.

    my @found = grep { exists $neurot_hash{ lc($_) } } @words;
    print join (" ",  @found);
    print "\n";
}
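The two hash-based answers tokenize slightly differently, which matters for hyphenated words. A small sketch to compare them side by side (the sample sentence is made up for illustration):

```perl
use strict;
use warnings;

my $sentence = "i'm sure this well-known trick isn't impossible";

# Tokenizer from this answer: split on anything that is not a word
# character or an apostrophe; grep drops empty fields between separators.
my @by_split = grep { length } split /[^\w']/, $sentence;

# Tokenizer from bvr's answer: a word character followed by any run of
# word characters, apostrophes, or hyphens.
my @by_match = $sentence =~ /\w['\w-]*/g;

print join('|', @by_split), "\n";
print join('|', @by_match), "\n";
```

With `split`, "well-known" becomes two tokens ("well" and "known"), while the match-based tokenizer keeps it as one; both keep "i'm" and "isn't" intact. Which behavior is right depends on whether hyphenated words should be looked up whole or by their parts.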