Question

I have some text, and I want to extract the lines that match the following pattern.

string1(string2,string3,int)

I am parsing it with Perl, but so far I have only managed to write code for the simpler rule string1(string2):

#!/usr/bin/perl
$txt='A (A, B, 49997 )';

$re1='((?:[a-z]+))';   # Word 1 (one or more letters, so that a single-letter word such as 'A' matches)
$re2='.*?'; # Non-greedy match on filler
$re3='(\\(.*\\))';  # Round Braces 1

$re=$re1.$re2.$re3;
if ($txt =~ m/$re/is)
{
$word1=$1;
$rbraces1=$2;
print "($word1) ($rbraces1) \n";
}

Answer 1

If you know the number of elements inside the parentheses in advance, you can simply match each element explicitly:

#!/usr/bin/perl
use warnings;
use strict;

my $txt='A (B, C, 49997 )';

my $id  = qr/([a-z]+)/i;
my $int = qr/([1-9][0-9]*)/;


if (my @matches = $txt =~ /$id \s* \(  $id , \s* $id , \s* $int \s* \)/x ) {
    print "($_)\n" for @matches;
}
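
Since the question asks for extracting whole lines, the same pattern can be anchored and applied line by line. The following is only a minimal sketch: the sample lines in @lines, the name $line_re, and the extra \s* around the separators are assumptions made for illustration, not part of the original answer.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical sample input; only the first and third lines fit the pattern.
my @lines = (
    'A (B, C, 49997 )',
    'not a match',
    'foo(bar, baz, 12)',
    'A (B, C)',
);

my $id  = qr/([a-z]+)/i;
my $int = qr/([1-9][0-9]*)/;

# Anchored with ^ and \z so that the whole line must have the expected shape.
my $line_re = qr/^ \s* $id \s* \( \s* $id \s* , \s* $id \s* , \s* $int \s* \) \s* \z/x;

my @extracted;
for my $line (@lines) {
    # In list context the match returns the four captured fields, or an empty list on failure.
    if (my @fields = $line =~ $line_re) {
        push @extracted, join('|', @fields);
    }
}
print "$_\n" for @extracted;
```

With the sample input above, this should print A|B|C|49997 and foo|bar|baz|12 and skip the other two lines.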

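If the identifier can be repeated an arbitrary number of times, you can still match it with ( $id , \s* )+, but a capture group inside a repetition keeps only its last match, so only the final identifier is returned. In that case, capture the whole list as one group and break it up with split /,\s*/.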

#!/usr/bin/perl
use warnings;
use strict;

my $txt='A (B, C, D, E, F, 49997 )';

my $id   = qr/[a-z]+/i;
my $int  = qr/[1-9][0-9]*/;
my $list = qr/$id (?: , \s* $id)*/x;


if (my @matches = $txt =~ /($id) \s* \( ($list) , \s* ($int) \s* \)/x ) {
    splice @matches, 1, 1, split /,\s*/, $matches[1];
    print "($_)\n" for @matches;
}
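
To see why the split is needed, the two approaches can be compared side by side. This is a small sketch under the same assumptions as the code above; the variable names @repeated, $list and @ids are made up for the illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $txt = 'A (B, C, D, 49997 )';
my $id  = qr/[a-z]+/i;

# The capture group sits inside a repeated group, so each repetition
# overwrites the previous capture and only the last identifier survives.
my @repeated = $txt =~ /\( (?: ($id) , \s* )+ /x;
print "repeated group: @repeated\n";

# Capturing the whole list once and splitting it keeps every identifier.
my ($list) = $txt =~ /\( ( $id (?: , \s* $id )* ) , /x;
my @ids = split /,\s*/, $list;
print "split list: @ids\n";
```

The first print shows only D, while the second shows B C D.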
