Can a Perl regex capture and split a string while keeping the pieces in their original order?

Date: 2017-04-19 17:22:24

Tags: regex perl

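I have some sample strings below that I want to split on the parenthesized parts. The pieces must come out in the order in which they appear in the string, so that joining them gives back the original string.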

my (@strArr) = $str =~ /^(.*?)|(\(.*?\)))$/;

  1. abc(def)ghi
    Result: abc,(def),ghi

  2. abc(def)ghi(jkl)
    Result: abc,(def),ghi,(jkl)

  3. abcdef(ghi)
    Result: abcdef,(ghi)

  4. (abc)
    Result: (abc)

  5. (abcd)efg
    Result: (abcd),efg

Can this be done with a single line of regex code? The results need to be stored in @strArr.

3 answers:

Answer 0: (score: 3)

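You can split the string with the pattern (\([^()]*\)), which matches a ( character, then zero or more characters other than ( and ), then a literal ), and captures the whole matched substring in group 1 so that Perl puts it into the resulting array.

The only downside is that you need to remove the empty matches (with grep {/\S/}), but the overall solution is quite readable:

my $str = "abc(def)ghi";
my $regexp = qr/( \( [^()]* \) )/x;
my @strArr = grep {/\S/} split /$regexp/, $str;
print join(", ", @strArr);

Output of the above demo code: abc, (def), ghi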
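As a minimal sketch of the same split-based idea, the snippet below wraps it in a helper and checks that joining the pieces reproduces the input. The helper name split_keep_parens and the sample string are hypothetical, and grep { length($_) } is swapped in for grep {/\S/} so that whitespace-only pieces are kept; that substitution is an assumption about what is wanted, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: split on parenthesized groups. The capture group
# in the pattern makes split return the separators as well.
sub split_keep_parens {
    my ($str) = @_;
    # length() drops only empty strings, so whitespace-only pieces survive
    return grep { length($_) } split /(\([^()]*\))/, $str;
}

my $str    = "abc(def)ghi(jkl)";
my @strArr = split_keep_parens($str);

print join(", ", @strArr), "\n";                        # abc, (def), ghi, (jkl)
print "round trip ok\n" if join("", @strArr) eq $str;   # round trip ok
```

Joining with an empty string is the check that matters for the question: if it gives back the input, no characters were lost and the order was preserved.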

Answer 1: (score: 1)

Use a negated character class [^...]:

my (@strArr) = $str =~ /[^\s(]+|\([^)]*\)/g;

Pattern details:

/
[^\s(]+    # one or more characters that are neither whitespace nor opening round brackets
|          # OR
\(         # a literal opening round bracket
[^)]*      # zero or more characters that aren't closing round brackets
\)         # a literal closing round bracket
/g         # perform a global search (return all matches)
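The annotated breakdown can be turned into runnable code with the /x modifier, which lets whitespace and comments sit inside the pattern. This is only a sketch: the variable name $pattern and the sample string are illustrative and not taken from the answer.

```perl
use strict;
use warnings;

# Same pattern as in the answer, written with x so it can carry comments
my $pattern = qr{
    [^\s(]+      # a run of characters that are neither whitespace nor "("
    |            # OR
    \( [^)]* \)  # "(", then any characters other than ")", then ")"
}x;

my $str    = "abc(def)ghi(jkl)";
# In list context, a g match with no capture groups returns every match
my @strArr = $str =~ /$pattern/g;

print join(", ", @strArr), "\n";   # abc, (def), ghi, (jkl)
```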

Answer 2: (score: 1)

I tried both Wiktor's and Casimir's examples. Both work well.

#!/usr/bin/perl
use strict;
use warnings;

my %testHash = (
    '0' => '',
    '1' => 'abc(def)ghi',
    '2' => 'abc(def)ghi(jkl)',
    '3' => 'abcdef(ghi)',
    '4' => '(abc)',
    '5' => '(abcd)efg'
);

# Solution 1
print "By Wiktor:\n";
foreach my $key ( sort keys %testHash ) {
    my $str = $testHash{$key};
    my $regexp = qr/( \( [^()]* \) )/x;
    my @strArr = grep {/\S/} split /$regexp/, $str;

    print "$str - ".join(", ", @strArr)."\n";
}

# Solution 2
print "\nBy Casimir:\n";
foreach my $key ( sort keys %testHash ) {
    my $str = $testHash{$key};
    my (@strArr) = $str =~ /[^\s(]+|\([^)]*\)/g;

    print "$str - ".join(", ", @strArr)."\n";
}

Output:

By Wiktor:
 -
abc(def)ghi - abc, (def), ghi
abc(def)ghi(jkl) - abc, (def), ghi, (jkl)
abcdef(ghi) - abcdef, (ghi)
(abc) - (abc)
(abcd)efg - (abcd), efg

By Casimir:
 -
abc(def)ghi - abc, (def), ghi
abc(def)ghi(jkl) - abc, (def), ghi, (jkl)
abcdef(ghi) - abcdef, (ghi)
(abc) - (abc)
(abcd)efg - (abcd), efg
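None of the test strings above contains whitespace, so they do not show one difference between the two approaches. A small sketch with a made-up input that includes spaces (the string and the variable names are assumptions for illustration only):

```perl
use strict;
use warnings;

my $str = "foo bar(baz) qux";   # hypothetical input containing spaces

# Split-based approach (Wiktor's): whitespace stays inside the pieces
my @by_split = grep {/\S/} split /(\([^()]*\))/, $str;

# Match-based approach (Casimir's): [^\s(] never matches whitespace
my @by_match = $str =~ /[^\s(]+|\([^)]*\)/g;

print join("|", @by_split), "\n";   # foo bar|(baz)| qux
print join("|", @by_match), "\n";   # foo|bar|(baz)|qux

print "split round-trips\n" if join("", @by_split) eq $str;   # printed
print "match round-trips\n" if join("", @by_match) eq $str;   # not printed
```

If the input can contain spaces and the joined pieces must equal the original string, the split-based version is probably the safer starting point, although grep {/\S/} would still discard a piece that consists of whitespace only.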