How to refer to the matched part in a regular expression

Time: 2014-07-01 08:46:11

Tags: regex perl

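I use the following code to search for a substring and print it together with a few characters before and after it. For some reason, Perl takes issue with my use of $1 and complains:

Use of uninitialized value $1 in concatenation (.) or string

I can't figure out why... can you?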

use List::Util qw[min max];
my $word = "test";
my $lines = "this is just a test to find something out";
my $context = 3;
while ($lines =~ m/\b$word\b/g ) { # as long as pattern is found...
    print "$word\ ";
    print "$1";
    print substr ($lines, max(pos($lines)-length($1)-$context, 0), length($1)+$context); # check: am I possibly violating any boundaries here
}

5 Answers:

Answer 0 (score: 3)

You have to use parentheses to capture $word into the regex group $1:

while ($lines =~ m/\b($word)\b/g)

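Answer 1 (score: 1)

When you use $1, you are asking for the first captured group of the regex. Because your regex has no capture groups, that variable is never set.

You can either refer to the whole match with $&, or add a capture group to the regex and keep using $1.

That is, one of: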

use List::Util qw[min max];
my $word = "test";
my $lines = "this is just a test to find something out";
my $context = 3;
while ($lines =~ m/\b$word\b/g ) { # as long as pattern is found...
    print "$word\ ";
    print "$&";
    print substr ($lines, max(pos($lines)-length($&)-$context, 0), length($&)+$context); # check: am I possibly violating any boundaries here
}

or

use List::Util qw[min max];
my $word = "test";
my $lines = "this is just a test to find something out";
my $context = 3;
while ($lines =~ m/(\b$word\b)/g ) { # as long as pattern is found...
    print "$word\ ";
    print "$1";
    print substr ($lines, max(pos($lines)-length($1)-$context, 0), length($1)+$context); # check: am I possibly violating any boundaries here
}

Note: it does not matter whether you use (\b$word\b), (\b$word)\b, \b($word\b) or \b($word)\b here, because \b matches a 'string' of length 0.
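The note above can be checked with a small sketch. It is only an illustration: the array names are made up here, and \Q...\E is added as a precaution so that $word is matched literally. All four placements should capture the same text, since \b consumes no characters.

```perl
use strict;
use warnings;

my $word  = "test";
my $lines = "this is just a test to find something out";

# Four placements of the capture group relative to the \b assertions.
# \Q...\E (an addition for this sketch) quotes any metacharacters in $word.
my @patterns = (
    qr/(\b\Q$word\E\b)/,
    qr/(\b\Q$word\E)\b/,
    qr/\b(\Q$word\E\b)/,
    qr/\b(\Q$word\E)\b/,
);

my @captures;
for my $re (@patterns) {
    if ($lines =~ $re) {
        # \b is zero-width, so it adds nothing to the captured text.
        push @captures, $1;
        printf "%s captured '%s' (length %d)\n", $re, $1, length $1;
    }
}
```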

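Answer 2 (score: 0)

If you want to refer to a matched part of a regex, put that part in parentheses. You can then access it through the variable $1 (for the first pair of parentheses), $2 (for the second pair), and so on.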
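As a minimal sketch of that numbering, assuming a made-up input string "key=value" (not from the question), two pairs of parentheses fill $1 and $2 in order:

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only to show two capture groups.
my $text = "key=value";

my ($key, $value);
if ($text =~ /(\w+)=(\w+)/) {
    # $1 holds the first parenthesised group, $2 the second.
    ($key, $value) = ($1, $2);
    print "first group:  $key\n";
    print "second group: $value\n";
}
```

Copying $1 and $2 into named variables right after the match is a common precaution, because a later successful match overwrites them.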

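Answer 3 (score: 0)

$1, $2 and so on hold the strings found by the capture groups. Whenever a match succeeds, all of these variables are reset, so any that the new match does not capture are undef. The code in the question has no capture groups, so $1 is never given a value and stays undefined.

Running the code below shows the effect. Initially $1, $2 and $3 are undefined. The first match sets $1 and $2, but not $3. The second match sets only $1, and note that it clears $2 back to undefined. The third match has no capture groups, so all three are undefined.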

use strict;
use warnings;

sub show
{
    printf "\$1: %s\n", (defined $1 ? $1 : "-undef-");
    printf "\$2: %s\n", (defined $2 ? $2 : "-undef-");
    printf "\$3: %s\n", (defined $3 ? $3 : "-undef-");
    print "\n";
}

my $text = "abcdefghij";
show();

$text =~ m/ab(cd)ef(gh)ij/;  # First match
show();

$text =~ m/ab(cd)efghij/;  # Second match
show();

$text =~ m/abcdefghij/;  # Third match
show();

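Answer 4 (score: 0)

$1 has no value unless you actually capture something.

You can adapt your approach to gathering the surrounding context so that it uses lookahead and lookbehind.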

use strict;
use warnings;

my $lines = "this is just a test to find something out";
my $word = "test";
my $extra = 10;

while ($lines =~ m/(?:(?<=(.{$extra}))|(.{0,$extra}))\b(\Q$word\E)\b(?=(.{0,$extra}))/gs ) {
    my $pre = $1 // $2;
    my $word = $3;
    my $post = $4;
    print "'...$pre<$word>$post...'\n";
}

Output:

'...is just a <test> to find s...'
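For comparison, the same context can be gathered without lookarounds. This is a sketch of a different technique, not the method used in the answer above: it reads the capture offsets from Perl's built-in @- and @+ arrays and slices the string with substr. The @snippets array is a name introduced only for this sketch.

```perl
use strict;
use warnings;
use List::Util qw(max);

my $lines = "this is just a test to find something out";
my $word  = "test";
my $extra = 10;

my @snippets;
while ($lines =~ /\b(\Q$word\E)\b/g) {
    # $-[1] and $+[1] are the start and end offsets of capture group 1.
    my $start = max($-[1] - $extra, 0);        # clamp at the start of the string
    my $pre   = substr($lines, $start, $-[1] - $start);
    my $post  = substr($lines, $+[1], $extra); # substr stops at the end of the string
    push @snippets, "'...$pre<$1>$post...'";
}
print "$_\n" for @snippets;
```

With the sample string this yields the same line as the output shown above.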