我有一个包含超过10k元素的列表。我想删除每个第三个元素。
例如,
@testlists = qw (helloworld sessions first.cgi login localpcs depthhashes.cgi search view macros plugins ...) ;
我想从原始数组中删除first.cgi
,depthhashses.cgi
,macros
等。 Grep功能稍慢一点。请建议我更快的grep搜索或任何其他类似的子程序。任何帮助都将受到高度赞赏
答案 0 :(得分:7)
我可以想到一些解决方案:
Grep on index divisibility
my $i = 0;
@testlist = grep { ++$i % 3 } @testlist;
重复拼接
for (my $i = 2; $i < $#testlist; $i += 2) {
splice @testlist, $i, 1;
}
使用跳过进行复制
my @output;
# pre-extend the array for fewer reallocations
$#output = @testlist * 2/3;
@output = ();
# FIXME annoying off-by one errors at the end that can add one undef
for (my $i = 0; $i < @testlist; $i += 3) {
push @output, @testlist[$i, $i+1];
}
Ikegami在他出色的答案中纠正并优化了复制解决方案。
具有1,000个元素列表的基准声明拼接明显的赢家:
Rate slice grep copy splice
slice 790/s -- -10% -18% -37%
grep 883/s 12% -- -8% -29%
copy 960/s 22% 9% -- -23%
splice 1248/s 58% 41% 30% --
(slice
是暴徒的解决方案)
这可能是因为它将大部分实际工作卸载到C级实现中,并避免了分配和昂贵的Perl级操作。
拥有10,000个元素的列表,优势转向其他解决方案。实际上,拼接解决方案的算法复杂度非常差,因为它会在所有拼接位置之后移动所有元素,这意味着最后一个元素移动了近3333次:
Rate splice slice grep copy
splice 42.7/s -- -35% -42% -49%
slice 65.3/s 53% -- -12% -23%
grep 74.2/s 74% 14% -- -12%
copy 84.4/s 98% 29% 14% --
答案 1 :(得分:4)
amon的copy
可以调整为快30%!
my $i = 1;
my $j = 1;
while ($i < @a) {
$a[++$j] = $a[$i+=2];
$a[++$j] = $a[++$i];
}
$#a = $j-1 if @a>2;
你可以通过避免复制任何东西来获得更快的速度(尽管结果是对数组的引用)。每个元素中的字符串越长,这样做的效果就越好。
my $i = 0;
my $ref = sub { \@_ }->( grep { ++$i % 3 } @a );
结果(10,000个元素):
>perl a.pl
Rate splice_a splice_r grep_a copy1_a copy1_r copy2_r copy2_a grep_r
splice_a 52.8/s -- -0% -51% -54% -56% -66% -66% -68%
splice_r 52.9/s 0% -- -51% -54% -55% -66% -66% -68%
grep_a 107/s 103% 103% -- -7% -10% -30% -31% -34%
copy1_a 115/s 118% 117% 7% -- -3% -25% -26% -30%
copy1_r 119/s 125% 124% 11% 3% -- -23% -23% -27%
copy2_r 154/s 191% 190% 43% 34% 29% -- -0% -6%
copy2_a 154/s 192% 192% 44% 34% 30% 0% -- -6%
grep_r 163/s 209% 209% 52% 42% 37% 6% 6% --
基准:
use strict;
use warnings;
use Benchmark qw( cmpthese );
my @testlist = qw( helloworld sessions first.cgi login localpcs depthhashes.cgi search view macros );
@testlist = ( @testlist ) x ( 10000 / @testlist );
sub grep_a { my @a = @testlist; my $i = 0; @a = grep { ++$i % 3 } @a; 1 }
sub copy1_a { my @a = @testlist;
my @b;
$#b = $#a; @b = (); # Does absolutely nothing in this benchmark because of optimisations in Perl.
for (my $i = 0; $i < @a; $i += 3) {
push @b, @a[$i, $i+1];
}
1
}
sub copy2_a { my @a = @testlist;
my $i = 1;
my $j = 1;
while ($i < @a) {
$a[++$j] = $a[$i+=2];
$a[++$j] = $a[++$i];
}
$#a = $j-1 if @a>2;
1
}
sub splice_a { my @a = @testlist;
for (my $i = 2; $i < $#a; $i += 2) {
splice @a, $i, 1;
}
1
}
sub grep_r { my $r = [ @testlist ]; my $i = 0; $r = sub { \@_ }->( grep { ++$i % 3 } @$r ); 1 }
sub copy1_r { my $r = [ @testlist ];
my @b;
$#b = $#$r; @b = (); # Does absolutely nothing in this benchmark because of optimisations in Perl.
for (my $i = 0; $i < @$r; $i += 3) {
push @b, @$r[$i, $i+1];
}
$r = \@b;
1
}
sub copy2_r { my $r = [ @testlist ];
my $i = 1;
my $j = 1;
while ($i < @$r) {
$r->[++$j] = $r->[$i+=2];
$r->[++$j] = $r->[++$i];
}
$#$r = $j-1 if @$r>2;
1
}
sub splice_r { my $r = [ @testlist ];
for (my $i = 2; $i < $#$r; $i += 2) {
splice @$r, $i, 1;
}
1
}
cmpthese(-3, {
grep_a => \&grep_a,
copy1_a => \©1_a,
copy2_a => \©2_a,
splice_a => \&splice_a,
grep_r => \&grep_r,
copy1_r => \©1_r,
copy2_r => \©2_r,
splice_r => \&splice_r,
});
答案 2 :(得分:2)
使用数组切片。
@testlists = @testlists[ grep { ($_+1) % 3 } 0..$#testlists ];
答案 3 :(得分:0)
我不确定你使用grep的意思,但也许你的意思是这样的。
for $i (0 .. $#testlists) {
if (($i % 3) == 2) {
delete $testlists[$i];
}
}
# Demonstrate the values.
foreach $e (@testlists) {
print "$e\n";
}