Error capturing 4-character words with a Perl regular expression

Time: 2014-03-05 07:03:02

Tags: regex perl

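I am learning Perl and trying to capture the 4-character words in a user-supplied file. Below is the regex, used for pattern matching inside a while loop.

Code snippet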

while(<data>)
{
  $caps_string = $_; #assigning data to variable
  print "This is default string :\n $caps_string \n\n"; 

  $caps_string =~ tr/a-z/A-Z/;  #lower to upper case
  print "This is caps string :\n $caps_string \n\n"; 

  $caps_string =~ /\b[a-z]{4}\b/ig; #capturing 4 character words - which fails
  print "4 digit words in string are : \n $caps_string \n\n"; 

}

Output

This is default string :
 This is a text file data, coming from input.txt #correct

This is caps string :
 THIS IS A TEXT FILE DATA, COMING FROM INPUT.TXT #correct

4 digit words in string are : 
 THIS IS A TEXT FILE DATA, COMING FROM INPUT.TXT #incorrect according to me

Expected output for the last line:

 #exact 4 character words
     THIS TEXT FILE DATA FROM

The regex pattern and test string I am using give the expected output on regex101.

What is wrong with the pattern when it is used in Perl? Please advise!

2 answers:

Answer 0 (score: 3)

 #!/usr/local/bin/perl
 $caps_string = 'This is a text file data, coming from input.txt';
 print "This is default string :\n $caps_string \n\n";

 $caps_string =~ tr/a-z/A-Z/;  #lower to upper case
 print "This is caps string :\n $caps_string \n\n";

 ## You already converted string to upper case 
 ## So your pattern needs to match upper case letter .. so [A-Z] 
 ## And then you would want to store all the matches in an array   
 @matches = $caps_string =~ /\b[A-Z]{4}\b/g; #capturing 4 character words 
  print "4 digit words in string are : @matches \n";

The output I get:

This is default string :
 This is a text file data, coming from input.txt

This is caps string :
 THIS IS A TEXT FILE DATA, COMING FROM INPUT.TXT

4 digit words in string are : THIS TEXT FILE DATA FROM
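
The same idea can be carried back into the loop from the question. The match operator does not modify `$caps_string`, which is why the question's last `print` shows the whole line; the matches have to be taken from the return value of `m//g` in list context. The following is only a minimal sketch: the in-memory filehandle `$fh`, the `$sample` text, and the `@all_matches` array are assumptions standing in for the user-supplied file, and `uc` is used in place of `tr/a-z/A-Z/`.

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for the user-supplied file.
my $sample = "This is a text file data, coming from input.txt\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my @all_matches;    # hypothetical accumulator across all lines
while ( my $line = <$fh> ) {
    chomp $line;
    my $caps_string = uc $line;    # upper-case copy of the line

    # In list context, m//g returns every match; $caps_string is left unchanged.
    my @matches = $caps_string =~ /\b[A-Z]{4}\b/g;
    push @all_matches, @matches;

    print "4 character words in string are : @matches\n";
}
close $fh;
```

For the sample line this prints `4 character words in string are : THIS TEXT FILE DATA FROM`.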

Answer 1 (score: 2)

You need to specify the regex capture using ():

$caps_string =~ /\b([a-z]{4})\b/ig;  # Note the case-insensitive matching with /i

Then you will most likely want to store the matches:

my @fours = $caps_string =~ /\b([a-z]{4})\b/ig;  # 'THIS', 'TEXT', 'FILE', ...

print "@fours";   # "THIS TEXT FILE DATA FROM"
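
As a side note, here is a small sketch of how context affects `m//g`, using the upper-cased sample sentence from the question (the array names are illustrative only). In list context, `m//g` without a capture group returns each whole match, so the parentheses matter mainly when only part of the match is wanted. In scalar context, each evaluation finds the next match, so a `while` loop reading `$1` also works.

```perl
use strict;
use warnings;

my $caps_string = 'THIS IS A TEXT FILE DATA, COMING FROM INPUT.TXT';

# List context, no capture group: m//g returns each whole match.
my @without_group = $caps_string =~ /\b[a-z]{4}\b/ig;

# List context, with a capture group: m//g returns each captured group.
my @with_group = $caps_string =~ /\b([a-z]{4})\b/ig;

# Scalar context: each iteration finds the next match; $1 holds the capture.
my @from_loop;
while ( $caps_string =~ /\b([a-z]{4})\b/ig ) {
    push @from_loop, $1;
}

print "@without_group\n";
print "@with_group\n";
print "@from_loop\n";
```

All three lines print `THIS TEXT FILE DATA FROM`.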