Find duplicates in a hash and store the groups in a new hash

Time: 2013-07-29 19:12:09

Tags: perl sorting hashmap duplicates

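I have the following hash, and I need to find the duplicates between the arrays stored under the top-level keys 6 and 4. I have tried a few solutions to no avail, and I am not familiar enough with Perl syntax to get this working.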

The hash I have:

$VAR1 = { 
    '6' => [ '1000', '2000', '4000' ],
    '4' => [ '1000', '2000', '3000' ]
}; 

The hash I need:

$VAR1 = {
    '6' => ['4000'],
    '4' => ['3000'],
    'Both' => ['1000','2000']
}

2 answers:

Answer 0: (score: 1)

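  1. Find all the common elements, e.g. by using a hash for de-duplication.
  2. Find all the elements that are not common.

Given two arrays @x and @y, this would mean: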

    use List::MoreUtils 'uniq';
    
    # find all common elements
    my %common;
    $common{$_}++ for uniq(@x), uniq(@y); # count all elements
    $common{$_} == 2 or delete $common{$_} for keys %common;
    
    # remove entries from @x, @y that are common:
    @x = grep { not $common{$_} } @x;
    @y = grep { not $common{$_} } @y;
    
    # Put the common strings in an array:
    my @common = keys %common;
    

All that remains is a bit of dereferencing and so on, but that should be fairly straightforward.
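The dereferencing step might look like the sketch below, which applies the same counting idea to the hash from the question. It is an illustration under stated assumptions, not part of the original answer: the names `$data` and `%result` are made up here, `uniq` is taken from `List::Util` (available there from version 1.45; `List::MoreUtils` exports a function of the same name), and the `Both` list is sorted only because `keys` returns entries in no particular order.

```perl
use strict;
use warnings;
use List::Util 'uniq';   # assumption: List::Util 1.45 or later

# Input hash from the question ($data is a name chosen for this sketch)
my $data = {
    '6' => [ '1000', '2000', '4000' ],
    '4' => [ '1000', '2000', '3000' ],
};

# Dereference the two array refs into plain arrays
my @x = @{ $data->{6} };
my @y = @{ $data->{4} };

# Count each distinct value once per array; a count of 2 means "in both"
my %common;
$common{$_}++ for uniq(@x), uniq(@y);
$common{$_} == 2 or delete $common{$_} for keys %common;

# Build the new hash: leftovers per key, plus the shared values
my %result = (
    6    => [ grep { not $common{$_} } @x ],
    4    => [ grep { not $common{$_} } @y ],
    Both => [ sort keys %common ],
);
```

Building a separate `%result` leaves the input hash untouched; assigning the filtered lists back into `$data` would work just as well if in-place modification is preferred.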

Answer 1: (score: 0)

No other modules are needed. Perl hashes are well suited to finding unique or common values.

my %both;
# count the number of times any element was seen in 4 and 6
$both{$_}++ for (@{$VAR1->{4}}, @{$VAR1->{6}});
for (keys %both) {
  # if the count is one the element isn't in both 4 and 6
  delete $both{$_} if( $both{$_} == 1 );
}
$VAR1->{Both} = [keys %both];
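
Note that the snippet above only adds the Both entry: the arrays under 4 and 6 still contain the shared values, and a value repeated inside a single array would also reach a count of 2. A possible core-Perl extension that handles both points is sketched below. The helper name `group_shared` is hypothetical and does not come from the answer, and the result is sorted only to make the order predictable.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): moves the values shared by the
# arrays under keys $k1 and $k2 into a new 'Both' entry, modifying $h in place.
sub group_shared {
    my ($h, $k1, $k2) = @_;

    # Count in how many of the two arrays each value occurs
    my %count;
    for my $key ($k1, $k2) {
        my %seen;   # per-array, so repeats inside one array count once
        $count{$_}++ for grep { !$seen{$_}++ } @{ $h->{$key} };
    }

    # Values seen in both arrays
    my %both = map { $_ => 1 } grep { $count{$_} == 2 } keys %count;

    # Keep only the non-shared values under each original key
    for my $key ($k1, $k2) {
        $h->{$key} = [ grep { !$both{$_} } @{ $h->{$key} } ];
    }
    $h->{Both} = [ sort keys %both ];
    return $h;
}

my $VAR1 = {
    '6' => [ '1000', '2000', '4000' ],
    '4' => [ '1000', '2000', '3000' ],
};
group_shared($VAR1, 4, 6);
```

After the call, `$VAR1` has the shape requested in the question: the shared values sit under `Both`, and each original key keeps only what is unique to it.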