Question

Possible duplicate:
In Perl, is there a built in way to compare two arrays for equality?

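I need to compare arrays with a function that should return:

true if all elements are equal when compared pairwise
true if all elements are equal or the element in the first array is undefined when compared pairwise
false in all other cases

In other words, if the sub is called "comp":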

@a = ('a', 'b', undef, 'c');
@b = ('a', 'b', 'f', 'c');
comp(@a, @b); # should return true
comp(@b, @a); # should return false

@a = ('a', 'b');
@b = ('a', 'b', 'f', 'c');
comp(@a, @b); # should return true

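The obvious solution would be to do pairwise comparisons between the two arrays, but I'd like it to be faster than that, because the comparisons are run many times over a large set of arrays, and the arrays may have many elements.

On the other hand, the contents of the arrays to be compared against (i.e. all the possible @b's) are predetermined and do not change. The elements of the arrays do not have a fixed length, and there are no guarantees about which characters they might contain (tabs, commas, you name it).

Is there a faster way to do this than pairwise comparison? Smart match won't cut it, because it returns true only if all elements are equal (and therefore not if one of them is undef).

Could packing and doing bitwise comparisons be a strategy? It looked promising when I browsed the docs for pack/unpack and vec, but I'm somewhat out of my depth there.

Thanks.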

Answer 1

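Perl can compare lists of 10,000 pairwise elements in about 100 ms on my Macbook, so the first thing I'll say is to profile your code to make sure this is actually the problem.

After doing some benchmarking, there are a few things you can do to speed it up.

Make sure to bail out on the first failure to match.

Assuming you have a lot of comparisons which don't match, this will save HEAPS of time.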
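As a rough illustration of bailing out early, here is a minimal sketch using the semantics from the question, where an undefined element in the first array matches anything. The name comp_bail_early and the choice to pass array references are assumptions made for this example, not something from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array references and returns 0 at the
# first defined element of $x that differs from the element of $y at the
# same position. Undefined elements of $x act as wildcards.
sub comp_bail_early {
    my ($x, $y) = @_;

    for my $i (0 .. $#$x) {
        next     if !defined $x->[$i];        # undef in the first array matches anything
        return 0 if !defined $y->[$i];        # defined vs. undef (or missing) is a mismatch
        return 0 if $x->[$i] ne $y->[$i];     # stop at the first difference
    }

    return 1;
}
```

Passing references also avoids copying and flattening both arrays into @_ on every call.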

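Check that the arrays are the same length.

If the arrays aren't the same length, they can never match. Compare their sizes and return early if they differ. This avoids having to check for that case over and over inside the loop.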
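A minimal sketch of the size check, assuming all elements are defined and that arrays of different lengths must never match. The sub name is hypothetical. The question's second example, where a shorter first array should still match, would need a weaker test, such as only rejecting a first array that is longer than the second.

```perl
use strict;
use warnings;

# Hypothetical helper: rejects arrays of different sizes before looping,
# then compares the elements pairwise. Assumes every element is defined.
sub same_length_and_equal {
    my ($x, $y) = @_;

    # In numeric context an array evaluates to its element count.
    return 0 if @$x != @$y;

    for my $i (0 .. $#$x) {
        return 0 if $x->[$i] ne $y->[$i];
    }

    return 1;
}
```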

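Use iterators instead of C-style loops.

When iterating pairwise you'd normally write something like for( my $idx = 0; $idx <= $#a; $idx += 2 ), but iterating over a list is faster than using a C-style loop. This is a general Perl optimization trick: it is more efficient to let perl do the work internally in optimized C than to do it in Perl code. This gains you about 20%-30%, depending on how you micro-optimize it.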

for my $mark (0..$#{$a}/2) {
    my $idx = $mark * 2;
    next if !defined $a->[$idx] || !defined $b->[$idx];
    return 0 if $a->[$idx] ne $b->[$idx] || $a->[$idx+1] ne $b->[$idx+1];
}
return 1;
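To check whether the loop style matters on your own machine, something like the following sketch with the core Benchmark module could be used. The test data and the sub names c_style and range_style are made up for illustration, and the numbers will vary with the perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: 5,000 key/value pairs and an identical copy, so both
# loops have to walk the whole list.
my @pairs = map { ("key$_", "value$_") } 1 .. 5_000;
my @copy  = @pairs;

# Pairwise comparison with a C-style loop.
sub c_style {
    my ($x, $y) = @_;
    for (my $idx = 0; $idx <= $#$x; $idx += 2) {
        return 0 if $x->[$idx] ne $y->[$idx] || $x->[$idx+1] ne $y->[$idx+1];
    }
    return 1;
}

# The same comparison, iterating over a range instead.
sub range_style {
    my ($x, $y) = @_;
    for my $mark (0 .. $#$x / 2) {
        my $idx = $mark * 2;
        return 0 if $x->[$idx] ne $y->[$idx] || $x->[$idx+1] ne $y->[$idx+1];
    }
    return 1;
}

# Run each variant 200 times and print a comparison table.
cmpthese(200, {
    c_style     => sub { c_style(\@pairs, \@copy) },
    range_style => sub { range_style(\@pairs, \@copy) },
});
```

A larger iteration count, or a negative count to run for a minimum number of CPU seconds, gives more stable figures.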

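Precompute the interesting indexes.

Since one set of pairs is fixed, you can build a list of the indexes at which its pairs are defined. This makes the iteration even simpler and faster.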

state $indexes = precompute_indexes($b);

for my $idx ( @$indexes ) {
    next if !defined $a->[$idx];
    return 0 if $a->[$idx] ne $b->[$idx] || $a->[$idx+1] ne $b->[$idx+1];
}

return 1;

With no nulls, this is a performance boost of 40%. The more nulls there are in your fixed set, the more you gain beyond that.

use strict;
use warnings;
use v5.10;  # for state

# Compute the indexes of a list of pairs which are interesting for
# comparison: those with defined keys.
sub precompute_indexes {
    my $pairs = shift;

    die "Unbalanced pairs" if @$pairs % 2 != 0;

    my @indexes;
    for( my $idx = 0; $idx <= $#$pairs; $idx += 2 ) {
         push @indexes, $idx if defined $pairs->[$idx];
     }

    return \@indexes;
}

sub cmp_pairs_ignore_null_keys {
    my($a, $b) = @_;

    # state is like my, but its initializer is only evaluated once.
    # It acts like a cache which is filled the first time this sub
    # is called, so this assumes the same fixed $b on every call.
    state $indexes = precompute_indexes($b);

    # If they don't have the same # of elements, they can never match.
    return 0 if @$a != @$b;

    for my $idx ( @$indexes ) {
        next if !defined $a->[$idx];
        return 0 if $a->[$idx] ne $b->[$idx] || $a->[$idx+1] ne $b->[$idx+1];
    }

    return 1;
}
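Because the state variable above is filled once, from the first $b it sees, the sub only suits a single fixed list. If there are several fixed lists, as the question suggests, one possible variation is to keep a cache per list. The sketch below keys the cache on the array's reference address. The names %index_cache, defined_key_indexes and cmp_pairs_cached are hypothetical, and it assumes the fixed arrays are neither modified nor freed while the cache is in use.

```perl
use strict;
use warnings;
use v5.10;                          # for //=
use Scalar::Util qw(refaddr);

# Hypothetical cache: one list of interesting indexes per fixed array,
# keyed by the address of the array reference.
my %index_cache;

# Return the even indexes whose key is defined, computing them only on
# the first call for a given array.
sub defined_key_indexes {
    my ($pairs) = @_;
    return $index_cache{ refaddr($pairs) } //= [
        grep { $_ % 2 == 0 && defined $pairs->[$_] } 0 .. $#$pairs
    ];
}

sub cmp_pairs_cached {
    my ($x, $y) = @_;

    # Lists of different sizes can never match.
    return 0 if @$x != @$y;

    for my $idx (@{ defined_key_indexes($y) }) {
        next if !defined $x->[$idx];
        return 0 if $x->[$idx] ne $y->[$idx] || $x->[$idx+1] ne $y->[$idx+1];
    }

    return 1;
}
```

The second argument must be a reference to the same long-lived array on each call, for example cmp_pairs_cached(\@candidate, \@fixed). A fresh anonymous array would get a new cache entry every time.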

I'm still convinced this would be better done in SQL with a self-join, but I haven't worked that out.
