How to easily access the array of captures from a pattern match in a Perl program

Date: 2014-09-20 14:22:04

Tags: arrays regex perl

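My question is: how can I easily access the captures of a pattern match as an array in a Perl program? (I know there are solutions involving split or /.../g, but I am specifically asking for a simple way to access the variables $1, $2, $3, ... as an array.)

(I would expect there to be an array, analogous to @- and @+, but I could not find one.)

Here is what I have so far (solution A uses substr($line, $-[$_], $+[$_] - $-[$_]) and solution B uses eval "\$$_"), but I would rather access the variables $1, $2, $3, ... directly as an array: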

use strict;
use warnings;

my $line = (join '', map { chr($_ + 64) } 1..26) x 10;

my $rstr = '';
$rstr .= '('.('.' x (rand(3) + 2)).')' for 1..rand(15) + 3;

unless ($line =~ m{\A $rstr}xms) {
    die "No match";
}

print $rstr, "\n";

for (1..$#-) {
    printf "A> %3d. -> pos%3d -%3d = '%s'\n", $_,
      $-[$_], $+[$_] - 1, substr($line, $-[$_], $+[$_] - $-[$_]);
}

print "\n";

for (1..$#-) {
    printf "B> %3d. -> pos%3d -%3d = '%s'\n", $_,
      $-[$_], $+[$_] - 1, eval "\$$_";
}

2 Answers:

Answer 0 (score: 6)

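In Perl, the result of an expression can depend on the context in which it is evaluated (scalar context, list context, or void context). If you assign the result of the =~ operator to an array, the match is evaluated in list context and the array receives the captured values.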

@arr = ('abcd' =~ /(.)(.)(.)(.)/);

Here @arr contains exactly ($1, $2, $3, $4), that is, ('a', 'b', 'c', 'd').
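To make the context dependence concrete, here is a minimal sketch that runs the same match in list context and in scalar context. The input string and the variable names ($input, @parts, @none, $matched) are made up for this illustration and are not part of the question's code.

```perl
use strict;
use warnings;

# Hypothetical input, chosen only for this illustration.
my $input = '2014-09-20';

# List context: a successful match returns ($1, $2, $3).
my @parts = $input =~ /\A(\d{4})-(\d{2})-(\d{2})\z/;
print "captures: @parts\n";

# A failed match returns the empty list, so the array doubles as a success test.
my @none = 'no date here' =~ /\A(\d{4})-(\d{2})-(\d{2})\z/;
print "failed match gives ", scalar(@none), " captures\n";

# Scalar context: the same match only reports success or failure.
my $matched = $input =~ /\A(\d{4})-(\d{2})-(\d{2})\z/;
print "scalar context: ", ($matched ? "matched" : "no match"), "\n";
```

One caveat when using the array as a success test: a pattern with no capture groups returns the list (1) on a successful match in list context, so the array is non-empty even though nothing was captured.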

Answer 1 (score: -1)

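As @atycnth has already pointed out, the array you are looking for is what a regular expression match returns in list context.

Here is your code, extended to use this approach: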

use strict;
use warnings;

# Fake Data
my $fake_data = join '', ( 'A' .. 'Z' ) x 10;

# Build a random Regular Expression
my $pattern = join '', map { '(.{' . int( 2 + rand 3 ) . '})' } ( 1 .. 3 + rand 15 );

if ( my @captures = $fake_data =~ m{\A $pattern}xms ) {
    print "$pattern\nSubstr:\n";

    for ( 1 .. $#- ) {
        printf "%5d. -> pos%3d -%3d = '%s'\n", $_, $-[$_], $+[$_] - 1, substr( $fake_data, $-[$_], $+[$_] - $-[$_] );
    }

    print "\nEval:\n";

    for ( 1 .. $#- ) {
        printf "%5d. -> pos%3d -%3d = '%s'\n", $_, $-[$_], $+[$_] - 1, eval "\$$_";
    }

    print "\nList:\n";

    my $num = 0;
    for my $capture (@captures) {
        $num++;
        printf "%5d. -> pos%3d -%3d = '%s'\n", $num, $-[$num], $+[$num] - 1, $capture;
    }

} else {
    die "No match";
}

Output:

(.{3})(.{4})(.{4})(.{4})(.{3})(.{2})(.{2})(.{4})(.{3})
Substr:
     1. -> pos  0 -  2 = 'ABC'
     2. -> pos  3 -  6 = 'DEFG'
     3. -> pos  7 - 10 = 'HIJK'
     4. -> pos 11 - 14 = 'LMNO'
     5. -> pos 15 - 17 = 'PQR'
     6. -> pos 18 - 19 = 'ST'
     7. -> pos 20 - 21 = 'UV'
     8. -> pos 22 - 25 = 'WXYZ'
     9. -> pos 26 - 28 = 'ABC'

Eval:
     1. -> pos  0 -  2 = 'ABC'
     2. -> pos  3 -  6 = 'DEFG'
     3. -> pos  7 - 10 = 'HIJK'
     4. -> pos 11 - 14 = 'LMNO'
     5. -> pos 15 - 17 = 'PQR'
     6. -> pos 18 - 19 = 'ST'
     7. -> pos 20 - 21 = 'UV'
     8. -> pos 22 - 25 = 'WXYZ'
     9. -> pos 26 - 28 = 'ABC'

List:
     1. -> pos  0 -  2 = 'ABC'
     2. -> pos  3 -  6 = 'DEFG'
     3. -> pos  7 - 10 = 'HIJK'
     4. -> pos 11 - 14 = 'LMNO'
     5. -> pos 15 - 17 = 'PQR'
     6. -> pos 18 - 19 = 'ST'
     7. -> pos 20 - 21 = 'UV'
     8. -> pos 22 - 25 = 'WXYZ'
     9. -> pos 26 - 28 = 'ABC'
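
The Substr and List sections agree because each capture is simply the substring between its start offset in @- and its end offset in @+. The following is a small sketch of that equivalence with a fixed pattern instead of a random one; the sample string and the names $text, @from_list, and @from_offsets are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sample string, chosen only for this illustration.
my $text = 'ABCDEFGHIJ';

# List context gives the captures directly.
my @from_list = $text =~ /\A(.{3})(.{2})(.{4})/
    or die "No match";

# The same values rebuilt from the offset arrays @- and @+.
# $-[$n] is the start of group $n, $+[$n] is one past its end.
my @from_offsets = map { substr($text, $-[$_], $+[$_] - $-[$_]) } 1 .. $#-;

print "list:    @from_list\n";
print "offsets: @from_offsets\n";
```

The offset version assumes that every group took part in the match. With optional groups, $#- only reaches the last group that matched, and the offsets of a group that did not participate are undefined, so the list-context form is the safer choice when all you need are the captured strings.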