Match an optional pattern only when it is present

Time: 2013-01-04 21:58:03

Tags: regex perl

I have the following code:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3  Optional: Some stuff here';
#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+)  Optional: (.+?)(  |$)/;

if ($SourceStr =~ m/$RegEx/) {
   print "1=[$1]\n";
   print "2=[$2]\n";
   print "3=[$3]\n";
   print "4=[$4]\n";
}

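When run with the first $SourceStr, it works as expected. For the second one (commented out), however, is there a way to have $4 populated with an empty string?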

Result for the first string:

1=[Rob]
2=[11/2/2011 13:47:30]
3=[Prelim 3]
4=[Some stuff here]

Result for the second string: no match

Wanted:

1=[Rob]
2=[11/2/2011 13:47:30]
3=[Prelim 3]
4=[]

5 answers:

Answer 0 (score: 2)

You can use a more specific regular expression, with the Optional part wrapped in an optional non-capturing group:

#!/usr/bin/perl
use warnings;
use strict;

my @SourceStrA=('Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
                'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3');

my $RegEx = qr!Name:\s*(\w+)\s*Time:\s*([\d/]*\s*[\d:]*)\s*State:\s*(\d+)\s*(?:Optional:\s*(.*))?!;

for my $SourceStr (@SourceStrA) {
  print "$SourceStr\n";
  if ($SourceStr =~ m/$RegEx/) {
    print "1=[$1]\n";
    print "2=[$2]\n";
    print "3=[$3]\n";
    print "4=[$4]\n" if defined $4; 
  }
}

Output:

Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here
1=[Rob]
2=[11/2/2011 13:47:30]
3=[3]
4=[Some stuff here]
Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3
1=[Rob]
2=[11/2/2011 13:47:30]
3=[3]

Answer 1 (score: 1)

Maybe you should collect the fields into a hash instead.

#!/usr/bin/perl
use warnings;
use strict;

#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';
my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my %Values;

while ($SourceStr =~ m/(\w+): (.+?)(?:  |$)/g) {
    $Values{$1} = $2;
}

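# Test defined-ness rather than truth so that a value of "0" is still accepted.
if (defined $Values{Name} && defined $Values{Time} && defined $Values{State}) {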
    print "1=$Values{Name}\n";
    print "2=$Values{Time}\n";
    print "3=$Values{State}\n";

    if (defined $Values{Optional}) {
        print "4=$Values{Optional}\n";
    } else {
        print "4=\n";
    }
}
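
The hash approach can also supply the empty default up front, so no `defined` test is needed when printing. A minimal sketch, assuming the fields are separated by two spaces as in the question; the helper name `parse_fields` and its defaults argument are made up for illustration:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: collect "Key: value" pairs into a hash,
# starting from caller-supplied defaults for fields that may be absent.
sub parse_fields {
    my ($str, %defaults) = @_;
    my %values = %defaults;
    # Same pattern as above: a word, a colon and a space, then the shortest
    # run of text that ends at a double space or at the end of the string.
    while ($str =~ m/(\w+): (.+?)(?:  |$)/g) {
        $values{$1} = $2;
    }
    return %values;
}

my %v = parse_fields(
    'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
    Optional => '',    # default used when the field is missing
);
print "4=[$v{Optional}]\n";    # prints 4=[]
```

A field that is present in the input simply overwrites its default.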

Answer 2 (score: 1)

The request looks odd, but here is a solution:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';
#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:  Optional: )?(.*)(  |$)/;

if ($SourceStr =~ m/$RegEx/) {
   print "1=[$1]\n";
   print "2=[$2]\n";
   print "3=[$3]\n";
   print "4=[$4]\n";
}

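The trick, of course, is to use the (?: ) syntax to create an extra group without changing the position of $4. Note that using (?:  Optional: (.*))? would not work here (although it is more logical and more robust), because it would leave $4 undefined when the Optional part is missing (and you need it to be an empty string), and the use warnings pragma would then print the annoying Use of uninitialized value... message.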
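
One caveat with the pattern above: because `(.+?)` is lazy and everything after it can match nothing, `$3` captures a single character whenever `  Optional: ` does not follow immediately, so it relies on State being one character long, as in the sample strings. Below is a hedged sketch of a variant that still leaves `$4` as an empty string but also copes with a value such as `Prelim 3`. The alternation `(?:  Optional: |$)` is an assumption about the intended behavior, not part of the original answer:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Variant: the lazy State capture must now stop either at "  Optional: "
# or at the end of the string, so it cannot end after one character.
my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:  Optional: |$)(.*)/;

for my $SourceStr (
    'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3  Optional: Some stuff here',
    'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3',
) {
    if ($SourceStr =~ $RegEx) {
        # $4 is always defined here: (.*) takes part in every successful match
        print "1=[$1]\n2=[$2]\n3=[$3]\n4=[$4]\n";
    }
}
```

For the second string this prints `3=[Prelim 3]` and `4=[]` without any warning.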

Anyway, these requirements look more like an exercise than a real-life problem, don't they?

Answer 3 (score: 1)

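Here is an option that produces the desired result. It makes the Optional part an optional group and substitutes an empty string when $4 is undefined: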

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';

#my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:\s+Optional: (.+))?$/;

if ( $SourceStr =~ $RegEx ) {
    print "1=[$1]\n";
    print "2=[$2]\n";
    print "3=[$3]\n";
    print '4=[' . ( $4 // '' ) . "]\n";
}

Answer 4 (score: 1)

As documented, it may be easier to handle the optional match with named captures rather than numbered ones.

#!/usr/bin/env perl

use warnings;
use strict;

my @SourceStr = (
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
);

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(  |$)/;

foreach (@SourceStr) {
  print "Input '$_'\n";
  if ( /$RegEx/ ) {
     print "Name = '$+{name}'\n";
     print "Time = '$+{time}'\n";
     print "State = '$+{state}'\n";
     print "Optional = '$+{optional}'\n" if $+{optional};
  }
  print "\n";
}

In fact, this makes things so simple that just dumping the %+ hash is almost easier:

#!/usr/bin/env perl

use warnings;
use strict;

my @SourceStr = (
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
);

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(  |$)/;

use Data::Dumper;
foreach (@SourceStr) {
  print "Input '$_'\n";
  print Dumper \%+ if /$RegEx/;
}
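
To get the empty string the question asks for, the named capture can be given a default with the defined-or operator. A sketch, assuming Perl 5.10 or later (needed for both named captures and `//`); the helper name `parse_record` is hypothetical:

```perl
#!/usr/bin/env perl
use warnings;
use strict;

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(?:  |$)/;

# Hypothetical helper: return the captured fields as a plain hash,
# or an empty list when the input does not match.
sub parse_record {
    my ($str) = @_;
    return unless $str =~ $RegEx;
    return (
        name     => $+{name},
        time     => $+{time},
        state    => $+{state},
        optional => $+{optional} // '',    # missing field becomes ''
    );
}

my %rec = parse_record('Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3');
print "Optional = '$rec{optional}'\n";    # prints Optional = ''
```

Copying the values out of %+ right after the match also protects them from being overwritten by a later successful match.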