How do I remove values that are duplicated across arrays stored in a hash?

Asked: 2014-04-17 22:23:24

Tags: arrays perl hash

I have the following hash:

my %HASH = (
    'List1' =>  [ 'the', 'red', 'cat', 'jumps' ],
    'List2' =>  [ 'the', 'brown', 'fox', 'jumps' ],
    'List3' =>  [ 'a', 'red', 'fox', 'jumps' ],
);

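I want to remove the elements that are duplicated across these arrays, so that only the unique elements remain. The desired output is: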

my %HASH = (
    'List1' =>  [ 'cat' ],
    'List2' =>  [ 'brown' ],
    'List3' =>  [ 'a' ],
);

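In other words, if an element appears in more than one list (for example, in both List1 and List2), it should be removed from every list that contains it.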

I tried the following:

use strict;
use warnings;
use diagnostics;
use Data::Dumper;

foreach my $key ( keys %HASH )  {

    foreach ( @{$HASH{$key}} )  {

        if(exists($HASH{$key})){
            @{$HASH{$key}} = delete($HASH{$key});
        }
    }
}

print Dumper(\%HASH);

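This doesn't seem to do anything useful; the hash still contains the duplicated words. I'm still fairly new to Perl, so I'm not sure where I went wrong. perldoc also says that calling exists on array values is discouraged anyway, so any solution that uses something else is welcome!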

1 answer:

Answer 0: (score: 2)

use strict;
use warnings;

my %hash = (
    'List1' =>  [ 'the', 'red', 'cat', 'jumps' ],
    'List2' =>  [ 'the', 'brown', 'fox', 'jumps' ],
    'List3' =>  [ 'a', 'red', 'fox', 'jumps' ],
);

# first, we count all words
my %count;
for my $words (values %hash) {
    for my $word (@$words) {
        $count{$word}++;
    }
}

# Now, we filter the words with `grep` so that only
# those remain which were found once
for my $words (values %hash) {
    @$words = grep { $count{$_} == 1 } @$words;
}

use Data::Dump;
dd \%hash;

Output:

{ List1 => ["cat"], List2 => ["brown"], List3 => ["a"] }
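
One thing to be aware of: the code above counts every occurrence of a word, so a word that is repeated inside a single list would also be filtered out, even if no other list contains it. If that is not what you want, a possible variant is to count the number of lists a word appears in instead. The following is only a sketch under that assumption; the extra 'cat' in List1 and the name %lists_with are made up for illustration, and it prints with plain print so it needs no non-core modules.

```perl
use strict;
use warnings;

my %hash = (
    'List1' => [ 'the', 'red', 'cat', 'jumps', 'cat' ],   # 'cat' repeated on purpose
    'List2' => [ 'the', 'brown', 'fox', 'jumps' ],
    'List3' => [ 'a', 'red', 'fox', 'jumps' ],
);

# Count in how many different lists each word appears.
# %seen is reset for every list, so a word is counted at most once per list.
my %lists_with;
for my $words (values %hash) {
    my %seen;
    $lists_with{$_}++ for grep { !$seen{$_}++ } @$words;
}

# Keep only the words that belong to exactly one list.
for my $words (values %hash) {
    @$words = grep { $lists_with{$_} == 1 } @$words;
}

for my $key (sort keys %hash) {
    print "$key: @{ $hash{$key} }\n";
}
# prints:
# List1: cat cat
# List2: brown
# List3: a
```

With this variant the repeated 'cat' survives because it only ever occurs in List1. Whether the repetition inside List1 should itself be collapsed to a single 'cat' is a separate decision that the question does not cover.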