Perl regex: removing lines that contain more than 3 uppercase words

Date: 2014-11-29 18:47:32

Tags: regex perl

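I need to write a Perl script that removes every line containing more than 3 words in uppercase (words are separated by spaces).

So far, I only remove lines that are entirely in uppercase: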

  while(my $text = <IN>)
  {
    $text =~ s/(^[A-Z \d\W]+$)\n//g;
  }

2 answers:

Answer 0 (score: 3)

Use the technique from perlfaq4, "How can I count the number of occurrences of a substring within a string?", to count the number of matches of a pattern.

Then apply the filter:

use strict;
use warnings;

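while (<DATA>) {
    # Count words of two or more uppercase letters; the empty list
    # assignment forces list context, so the number of matches is returned.
    my $uc_words = () = /\b[A-Z]{2,}\b/g;
    # Print only lines with fewer than 3 uppercase words.
    print if $uc_words < 3;
}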
__DATA__
FIRST lower SECOND
FIRST lower SECOND and THIRD and end
FIRST and SECOND and just an I, is that enough?
Filter me because of FIRST, SECOND, THIRD, and FOURTH.
Just First Letter Capitalized Is Cool, Right?

Output:

FIRST lower SECOND
FIRST and SECOND and just an I, is that enough?
Just First Letter Capitalized Is Cool, Right?
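
Note that the filter above drops lines with 3 or more uppercase words, while the question asks about lines with more than 3. A minimal sketch of the same counting idiom with a configurable limit follows; the helper name `count_uc_words`, the variable `$max`, and the sample lines are assumptions made for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: count words made of two or more uppercase letters,
# using the same pattern as the answer above.
sub count_uc_words {
    my ($line) = @_;
    my $count = () = $line =~ /\b[A-Z]{2,}\b/g;
    return $count;
}

# Hypothetical threshold: keep lines with at most $max uppercase words,
# i.e. drop only lines with MORE than 3.
my $max = 3;

my @lines = (
    "FIRST lower SECOND",
    "FIRST lower SECOND and THIRD and end",
    "Filter me because of FIRST, SECOND, THIRD, and FOURTH.",
);

for my $line (@lines) {
    print "$line\n" if count_uc_words($line) <= $max;
}
```

With `$max = 3`, the first two sample lines are printed (2 and 3 uppercase words) and the last one (4 uppercase words) is dropped. Setting `$max = 2` reproduces the behaviour of the answer's `< 3` test.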

Answer 1 (score: 0)

Use this pattern:

^(?=(.*\b[A-Z]+\b){3}).*(\n|$)


^               # Start of string/line
(?=             # Look-Ahead
  (             # Capturing Group (1)
    .           # Any character except line break
    *           # (zero or more)(greedy)
    \b          # <word boundary>
    [A-Z]       # Character Class [A-Z]
    +           # (one or more)(greedy)
    \b          # <word boundary>
  )             # End of Capturing Group (1)
  {3}           # (repeated {3} times)
)               # End of Look-Ahead
.               # Any character except line break
*               # (zero or more)(greedy)
(               # Capturing Group (2)
  \n            # <new line>
  |             # OR
  $             # End of string/line
)               # End of Capturing Group (2)
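
One way this pattern might be applied in Perl is sketched below. It assumes the whole input is held in a single string and that the `/m` modifier is used so that `^` and `$` match at line boundaries; the variable names `$text` and `$filtered` and the sample lines are illustrative assumptions.

```perl
use strict;
use warnings;

# Sample input held in one string (hypothetical data for illustration).
my $text = join "\n",
    "FIRST lower SECOND",
    "FIRST lower SECOND and THIRD and end",
    "Just First Letter Capitalized Is Cool, Right?",
    "";

# /m lets ^ and $ match at the start and end of each line;
# /g applies the substitution to every matching line.
(my $filtered = $text) =~ s/^(?=(.*\b[A-Z]+\b){3}).*(\n|$)//mg;

print $filtered;
```

Here only the second line is removed, because it is the only one with three uppercase words. Two details are worth keeping in mind: `{3}` makes the pattern match lines with 3 or more uppercase words, and `[A-Z]+` also counts single-letter words such as "I", unlike the `[A-Z]{2,}` used in the other answer.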