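I am trying to remove the duplicates from the string "a","b","b","a","c"
so that the result is "a","b","c".
I have achieved this, but I have a doubt about how the regex substitution works.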
use warnings;
use strict;
my $s = q+"a","b","b","a","c"+;
$s=~s/ ("\w"),? / ($s=~s|($1)||g)?"$1,":"" /xge;
#^                  ^
#|                  Consider this as s2
#Consider this as s1
print "\n$s\n\n";
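At s1, $s holds the string "a","b","b","a","c".
Step 1
After the inner substitution, which data does the variable at s2 hold? My guesses are: the data from s1,
or "a","b","b","c"
or "a","b","b","a","c"
or ,"b","b",,"c"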
I ran the regex again with an eval group added:
$s=~s/ ("\w"),? (?{print "$s\n"})/ ($s=~s|($1)||g)?"$1,":"" /xge;
The result was
"a","b","b","a","c"
,"b","b",,"c" #This is from after substitution
,,,,"c"
,,,,"c"
,,,,"c"
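Now my doubt is this: the variable $s at s2 is modified, so why is it not combined with s1?
If it were, the result in the second step should be "a","b","b","c"
(every "a" string replaced with nothing, and a added back to $s).
The output of the eval group (?{print $s}) is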
"a","b","b","a","c"
,"b","b",,"c"
,,,,"c"
,,,,"c"
,,,,"c"
After the substitution line I printed the variable $s, and it gives "a","b","c".
How is this output produced?
Answer 0 (score: 6)
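A regular expression is (in my opinion) the wrong tool to use here. I would split the string on commas, use grep to filter out the duplicates, and then join the list back into a string, like this: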
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $str = q["a","b","b","a","c"];
my %seen;
$str = join ',',
       grep { ! $seen{$_}++ }
       split /,/, $str;
say $str;
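As a minimal sketch of the same split, grep, join idea, the steps can be wrapped in a small subroutine. The name dedupe_fields is hypothetical and does not come from the answer; the sketch also assumes that no field contains a comma of its own, since it splits on every comma.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the answer):
# removes repeated comma-separated fields, keeping the first occurrence.
sub dedupe_fields {
    my ($str) = @_;
    my %seen;    # field => how many times it has been seen so far
    return join ',',
           grep { !$seen{$_}++ }    # true only the first time a field appears
           split /,/, $str;
}

print dedupe_fields(q["a","b","b","a","c"]), "\n";    # prints "a","b","c"
```

The post-increment is what makes the grep block work: $seen{$_} is still 0 (undefined) the first time a field is tested, so the negation is true and the field is kept, and every later occurrence sees a non-zero count and is dropped. The order of first appearance is preserved.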
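Answer 1 (score: 2)
The correct solution to this is to split, filter, and rejoin, as @Dave Cross has already demonstrated.
...
However, the following regex solution does work, and hopefully demonstrates why Dave's solution is superior: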
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;
my $str = q{"a","b","b","a","c"};
1 while $str =~ s{
    \A
    (?: (?&element) , )*
    ( (?&element) )         # Capture in \1
    (?: , (?&element) )*
    \K
    ,
    \1                      # Remove the duplicate along with preceding comma
    (?= \z | , )

    (?(DEFINE)
        (?<element>
            "
            \w
            "
        )
    )
}{}xg;
say $str;
Output:
"a","b","c"
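To see why the substitution has to be repeated with 1 while, the loop can be unrolled so that the string is recorded after each pass. This is only a sketch: the helper name dedupe_passes is hypothetical, and the pattern is the one from the answer with the (?(DEFINE)...) block collapsed onto one line. Because the pattern is anchored with \A, each pass removes a single duplicate.

```perl
#!/usr/bin/env perl
use v5.10;
use strict;
use warnings;

# Hypothetical helper (not part of the answer): runs the answer's
# substitution one pass at a time and returns the string as it looked
# after each pass that removed something.
sub dedupe_passes {
    my ($str) = @_;
    my @passes;
    while ($str =~ s{
        \A
        (?: (?&element) , )*
        ( (?&element) )         # Capture in \1
        (?: , (?&element) )*
        \K
        ,
        \1                      # The duplicate, removed with its comma
        (?= \z | , )
        (?(DEFINE) (?<element> " \w " ) )
    }{}xg) {
        push @passes, $str;     # snapshot after this pass
    }
    return @passes;
}

say for dedupe_passes(q{"a","b","b","a","c"});
```

The sample string contains two duplicates, so the loop body runs twice and the last line printed is the fully deduplicated string. Compared with the split, grep, join version, the regex has to rescan the string from the start for every duplicate it removes.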