Question

I want to build an array by splitting a string at every point where a character stops repeating. My current code is:

my $str = "1233345abcdde";
print "$_," for split /(?<=(.))(?!\1)/, $str;

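This returns: 1,1,2,2,333,3,4,4,5,5,a,a,b,b,c,c,dd,d,e,e,

However, what I actually want is: 1,2,333,4,5,a,b,c,dd,e, i.e. without the duplicated characters.

What is going wrong? I suspect the problem has to do with the nature of lookarounds, but I can't pin it down...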

Answer 1

The problem is that you are using a capture group in split, so it returns the "captures" as well as the "splits".

use Data::Dumper;
my @stuff = split /(=)/, "this=that";
print Dumper \@stuff;

Gives:

$VAR1 = [
          'this',
          '=',
          'that'
        ];

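Unfortunately, the 'fix' isn't easy. The best I can come up with is to skip the odd-numbered elements, for example by assigning the list to a hash so that the pieces become keys and the captures become values: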

my %stuff =  split /(?<=(.))(?!\1)/, $str;
print Dumper \%stuff;

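(But this doesn't preserve ordering, because hashes don't.)

But you can sort the keys, which happens to restore the original order for this particular input: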

print join (",", sort keys %stuff);
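One caveat with the hash approach is worth illustrating: hash keys are unique, so a run that appears more than once in the input is stored only once. The sketch below uses a made-up input, "aabaa", that is not from the question.

```perl
use strict;
use warnings;

# Hypothetical input (not from the question): the run "aa" occurs twice.
my $str  = "aabaa";

# split returns pieces and captures interleaved: ('aa','a','b','b','aa','a')
my @list = split /(?<=(.))(?!\1)/, $str;

# Assigning to a hash makes the pieces keys; the second "aa" overwrites the first.
my %stuff = @list;
my $count = keys %stuff;

print "$count distinct keys from ", scalar(@list) / 2, " runs\n";
```

So the hash version is only safe when every run in the string is distinct.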

Or perhaps take an array slice of the even-numbered indices, which keeps the original order:

my $str = "1233345abcdde";
my @stuff =  split /(?<=(.))(?!\1)/, $str;
print join ( ",", @stuff[grep { not $_ & 1 } 0..$#stuff] ),"\n";
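As a rough sketch, the even-index slice above could be wrapped in a small helper. The name `split_runs` is made up for illustration, and the `/s` flag is an addition so that `.` also matches a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the runs of identical characters in $str.
sub split_runs {
    my ($str) = @_;
    my @parts = split /(?<=(.))(?!\1)/s, $str;
    # Even indices hold the pieces; odd indices hold the captured character.
    return @parts[ grep { $_ % 2 == 0 } 0 .. $#parts ];
}

my @runs = split_runs("aabaa");
print join(",", @runs), "\n";
```

Unlike the hash version, this keeps both the order and any repeated runs.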

Answer 2

This does what you want, but you almost certainly shouldn't use it:

split /(??{ (substr $_, (pos)-1, 1) eq (substr $_, pos, 1) ? '(?!)' : '' })/, $str

Answer 3

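When the split regex contains capture groups, the returned list also includes the captured values. You have to filter them out somehow.

You want the same approach as in the other answers.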

Answer 4

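When you use captures, the text they capture is returned as well. You can filter out those extra values, in either of the following ways: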

my $i; my @matches = grep { ++$i % 2 } split /(?<=(.))(?!\1)/s, $str;

use List::Util qw( pairkeys );  # 1.29+
my @matches = pairkeys split /(?<=(.))(?!\1)/s, $str;

It is simpler to use a regex match instead of split. Any one of the following works:

my @matches; push @matches, $1 while $str =~ /((.)\2*)/sg;

my $i; my @matches = grep { ++$i % 2 } $str =~ /((.)\2*)/sg;

use List::Util qw( pairkeys );  # 1.29+
my @matches = pairkeys $str =~ /((.)\2*)/sg;
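For reference, here is the first match-based variant as a complete script run on the string from the question. It is only a minimal sketch that adds strict, warnings, and a print around the same regex.

```perl
use strict;
use warnings;

my $str = "1233345abcdde";
my @matches;

# ((.)\2*): $2 is a single character, \2* extends over its repeats,
# and $1 is the whole run of that character.
push @matches, $1 while $str =~ /((.)\2*)/sg;

print join(",", @matches), "\n";    # prints 1,2,333,4,5,a,b,c,dd,e
```

Because the pattern matches the runs themselves rather than the gaps between them, there are no extra captured values to filter out.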