Negative regex for Perl string pattern match

Time: 2011-06-15 16:50:13

Tags: regex perl

I have this regex:

if($string =~ m/^(Clinton|[^Bush]|Reagan)/i)
  {print "$string\n"};

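I want to match Clinton and Reagan, but not Bush.

It's not working.

5 answers:

Answer 0 (score: 129)

Your regex does not work because [] defines a character class, but what you want is a lookahead: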

(?=) - Positive look ahead assertion foo(?=bar) matches foo when followed by bar
(?!) - Negative look ahead assertion foo(?!bar) matches foo when not followed by bar
(?<=) - Positive look behind assertion (?<=foo)bar matches bar when preceded by foo
(?<!) - Negative look behind assertion (?<!foo)bar matches bar when NOT preceded by foo
(?>) - Once-only subpatterns (?>\d+)bar Performance enhancing when bar not present
(?(x)) - Conditional subpatterns
(?(3)foo|fu)bar - Matches foo if 3rd subpattern has matched, fu if not
(?#) - Comment (?# Pattern does x y or z)

So try: (?!bush)

Answer 1 (score: 27)
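To make the difference concrete, here is a minimal sketch comparing the two constructs. The helper names `not_bush` and `not_bush_class` are made up for this illustration and do not appear in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line does not start with "bush"
# (case-insensitive), using a negative lookahead.
sub not_bush {
    my ($line) = @_;
    return $line =~ /^(?!bush)/i ? 1 : 0;
}

# Character-class version for comparison: true when the first character
# exists and is none of b, u, s, h in either case.
sub not_bush_class {
    my ($line) = @_;
    return $line =~ /^[^Bush]/i ? 1 : 0;
}

for my $line ('Bush used crayons', 'Sanders spoke', 'Clinton said') {
    printf "%-20s lookahead=%d class=%d\n",
        $line, not_bush($line), not_bush_class($line);
}
```

The line "Sanders spoke" is where the two disagree: the lookahead accepts it, while the character class rejects it only because it starts with an "s".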

Sample text:

Clinton said
Bush used crayons
Reagan forgot

Omitting a Bush match:

$ perl -ne 'print if /^(Clinton|Reagan)/' textfile
Clinton said
Reagan forgot

Or if you really want to specify:

$ perl -ne 'print if /^(?!Bush)(Clinton|Reagan)/' textfile
Clinton said
Reagan forgot

Answer 2 (score: 16)

Your regex does the following:

/^         - if the line starts with
(          - start a capture group
Clinton    - "Clinton"
|          - or
[^Bush]    - any single character except "B", "u", "s" or "h" (either case, because of /i)
|          - or
Reagan)    - "Reagan". End capture group.
/i         - make matches case-insensitive

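So, in other words, the middle part of your regex is what is tripping you up. Because it acts as a catch-all alternative, it allows any line that does not begin with one of the letters in "Bush", upper or lower case. For example, these lines all match your regex: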

Our president, George Bush
In the news today, pigs can fly
012-3123 33
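This can be checked directly. A minimal sketch, in which `captured` is a hypothetical helper that returns whatever the capture group grabbed, and the fourth test line is added here for contrast:

```perl
use strict;
use warnings;

# The regex from the question, unchanged.
my $original = qr/^(Clinton|[^Bush]|Reagan)/i;

# Hypothetical helper: returns the text captured by group 1,
# or undef when the line does not match at all.
sub captured {
    my ($line) = @_;
    return $line =~ $original ? $1 : undef;
}

for my $line ('Our president, George Bush',
              'In the news today, pigs can fly',
              '012-3123 33',
              'Bush used crayons') {
    my $got = captured($line);
    print defined $got ? "matched '$got': $line\n" : "no match: $line\n";
}
```

For the first three lines the capture is a single character, which shows that the `[^Bush]` alternative is the one doing the matching.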

As suggested earlier, you can either use a negative lookahead or simply use two regexes:

if( ($string =~ m/^(Clinton|Reagan)/i) and
    ($string !~ m/^Bush/i) ) {
   print "$string\n";
}

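As mirod pointed out in the comments, the second check is quite unnecessary when the caret (^) is used to match only at the beginning of the line, since a line that begins with "Clinton" or "Reagan" can never begin with "Bush".

However, it would be meaningful without the carets.

Answer 3 (score: 3)

What's wrong with using two regexes (or three)? This makes your intention clearer and may even improve performance: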
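A minimal sketch of that unanchored case, assuming each string is a single line. Both helper names are hypothetical. The first mirrors the two-regex check with the carets removed; the second folds the same test into one regex with a negative lookahead.

```perl
use strict;
use warnings;

# Two regexes: "Clinton" or "Reagan" anywhere, and "Bush" nowhere.
sub mentions_without_bush {
    my ($line) = @_;
    return ($line =~ /Clinton|Reagan/i && $line !~ /Bush/i) ? 1 : 0;
}

# One regex: the lookahead at the start rejects any line containing
# "Bush", then the rest looks for either of the other two names.
sub mentions_without_bush_single {
    my ($line) = @_;
    return $line =~ /^(?!.*Bush).*(?:Clinton|Reagan)/i ? 1 : 0;
}

for my $line ('Clinton said', 'Bush met Clinton',
              'Reagan forgot Bush', 'Obama spoke') {
    printf "%-20s two=%d single=%d\n", $line,
        mentions_without_bush($line), mentions_without_bush_single($line);
}
```

Here the extra check does real work: "Bush met Clinton" contains an accepted name but is still rejected by both versions.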

if ($string =~ /^(Clinton|Reagan)/i && $string !~ /Bush/i) { ... }

if (($string =~ /^Clinton/i || $string =~ /^Reagan/i)
        && $string !~ /Bush/i) {
    print "$string\n"
}

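Answer 4 (score: 2)

If my understanding is correct, you want to match any line that has both Clinton and Reagan, in any order, but not Bush. As suggested by Stuck, here is a version with lookahead assertions: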

#!/usr/bin/perl

use strict;
use warnings;

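my $regex = qr/
    ^              # anchor, so the lookaheads are tested only at the start
    (?=.*clinton)  # "clinton" appears somewhere on the line
    (?!.*bush)     # "bush" appears nowhere on the line
    .*reagan       # "reagan" appears somewhere on the line
    /ix;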

while (<DATA>) {
    chomp;
    next unless (/$regex/);
    print $_, "\n";
}


__DATA__
shouldn't match - reagan came first, then clinton, finally bush
first match - first two: reagan and clinton
second match - first two reverse: clinton and reagan
shouldn't match - last two: clinton and bush
shouldn't match - reverse: bush and clinton
shouldn't match - and then came obama, along comes mary
shouldn't match - to clinton with perl

Results

first match - first two: reagan and clinton
second match - first two reverse: clinton and reagan

As desired, it matches any line that has Reagan and Clinton in any order.
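One detail worth knowing about lookahead chains like this is why they are usually anchored with ^. Without an anchor, the regex engine may retry the whole pattern from a later position in the string, where a negative lookahead no longer sees the forbidden word. A small sketch (the helper `hits` and the test line are made up for this illustration):

```perl
use strict;
use warnings;

my $unanchored = qr/(?=.*clinton)(?!.*bush).*reagan/i;
my $anchored   = qr/^(?=.*clinton)(?!.*bush).*reagan/i;

# Hypothetical helper: 1 if the line matches the given regex, else 0.
sub hits {
    my ($re, $line) = @_;
    return $line =~ $re ? 1 : 0;
}

# "bush" comes first, so a match attempt starting one character later
# sees only "ush, ..." and the negative lookahead passes.
my $line = 'bush, then clinton and reagan';
printf "unanchored=%d anchored=%d\n",
    hits($unanchored, $line), hits($anchored, $line);
```

The unanchored pattern accepts this line even though it mentions Bush, while the anchored one rejects it.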

You may want to read up on how lookahead assertions work, with examples, at http://www252.pair.com/comdog/mastering_perl/Chapters/02.advanced_regular_expressions.html

They are pretty tasty :).