Find duplicates in a hash and store the groups in a new hash

Time: 2013-07-29 19:12:09

Tags: perl sorting hashmap duplicates

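I have the following hash, and I need to find the duplicates between the values stored under the top-level keys 6 and 4. I have tried a few solutions to no avail, and I am not familiar enough with Perl syntax to get them working.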

The hash I have:

$VAR1 = { 
    '6' => [ '1000', '2000', '4000' ],
    '4' => [ '1000', '2000', '3000' ]
};

The hash I need:

$VAR1 = {
    '6' => ['4000'],
    '4' => ['3000'],
    'Both' => ['1000','2000']
}

2 answers:

Answer 0: (score: 1)

Find all common elements, e.g. by using a hash for deduplication.
Find all elements that are not common.

Given two arrays @x and @y, this would mean:

use List::MoreUtils 'uniq';

# find all common elements
my %common;
$common{$_}++ for uniq(@x), uniq(@y); # count all elements
$common{$_} == 2 or delete $common{$_} for keys %common;

# remove entries from @x, @y that are common:
@x = grep { not $common{$_} } @x;
@y = grep { not $common{$_} } @y;

# Put the common strings in an array:
my @common = keys %common;

All that is left now is a bit of dereferencing and so on to apply this to the hash, but that should be fairly straightforward.
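The dereferencing step might look roughly like the sketch below, which applies the same counting idea directly to the hash from the question. It is only an illustration under a few assumptions: the hash reference is named $data here (a name not used in the question), the per-array deduplication uses a plain %seen hash instead of List::MoreUtils so that nothing beyond core Perl is needed, and the 'Both' list is sorted only to make the result order predictable.

```perl
use strict;
use warnings;

# Assumed name for the hash reference from the question.
my $data = {
    '6' => [ '1000', '2000', '4000' ],
    '4' => [ '1000', '2000', '3000' ],
};

# Count each value at most once per key, so a value repeated inside
# a single array is not mistaken for one shared by both arrays.
my %count;
for my $key (qw(4 6)) {
    my %seen;
    $count{$_}++ for grep { !$seen{$_}++ } @{ $data->{$key} };
}

# A count of 2 means the value occurs under both keys.
my %common = map { $_ => 1 } grep { $count{$_} == 2 } keys %count;

# Replace each array with a copy that has the common values removed.
for my $key (qw(4 6)) {
    $data->{$key} = [ grep { !$common{$_} } @{ $data->{$key} } ];
}

# Store the shared values under a new key.
$data->{Both} = [ sort keys %common ];
```

Note that this modifies the original hash in place; copy it first if the original arrays are still needed.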

Answer 1: (score: 0)

No additional modules are needed. Perl hashes are well suited to finding unique or common values.

my %both;
# count the number of times any element was seen in 4 and 6
$both{$_}++ for (@{$VAR1->{4}}, @{$VAR1->{6}});
for (keys %both) {
  # if the count is one the element isn't in both 4 and 6
  delete $both{$_} if( $both{$_} == 1 );
}
$VAR1->{Both} = [keys %both];
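As written, the snippet above adds the 'Both' key but leaves the arrays under 4 and 6 unchanged, and a value repeated within a single array would also be counted more than once. One possible way to get the exact structure asked for, returning a new hash instead of modifying the original, is sketched below. The helper name split_common and the variable $result are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a new hash and leaves the input untouched.
sub split_common {
    my ($data, $k1, $k2) = @_;

    # Membership lookup for the first array.
    my %in_first = map { $_ => 1 } @{ $data->{$k1} };

    # Values of the second array that also occur in the first.
    my %in_both = map { $_ => 1 } grep { $in_first{$_} } @{ $data->{$k2} };

    return {
        $k1  => [ grep { !$in_both{$_} } @{ $data->{$k1} } ],
        $k2  => [ grep { !$in_both{$_} } @{ $data->{$k2} } ],
        Both => [ sort keys %in_both ],   # sorted only for a stable order
    };
}

my $VAR1 = {
    '6' => [ '1000', '2000', '4000' ],
    '4' => [ '1000', '2000', '3000' ],
};
my $result = split_common($VAR1, 4, 6);
```

Because membership is tested with a lookup hash rather than a counter, a value that appears twice in only one of the arrays does not end up under 'Both'.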