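Find duplicate values in a hash of arrays using Perl

Time: 2014-08-02 07:15:50

Tags: perl

I want to find the duplicate values in a hash whose values are arrays. I am able to get the duplicates within each individual key, but I also want the duplicates across the arrays of all keys. For example, for the source code below, I need the output to be: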

2 has the duplicate value d
3 has the duplicate value a
1 and 3 has duplicate value a

Source code I tried:

use strict;
use warnings;
my @test1 = qw(a b c);
my @test2 = qw(d e d);
my @test3 = qw(a h a);
my %hash = ( "1" => \@test1,
             "2" => \@test2,
             "3" => \@test3,
            );
while( my ($key, $values) = each %hash ) {
  my %seen = ();
  my @dup = map { 1==$seen{$_}++ ? $_ : () } @$values;
  if( $#dup > -1 ) {
    my $dupkey = join (" ",@dup);
    print "$key has the duplicate value $dupkey\n";
  }
}

4 Answers:

Answer 0: (score: 0)

I'm sure it can be done more elegantly, but my Perl is a bit rusty right now :) Nice exercise, by the way :)

use strict;
use warnings;
my @test1 = qw(a b c);
my @test2 = qw(d e d);
my @test3 = qw(a h a e);
my @test4 = qw(b d f e);
my %hash = ( "1" => \@test1,
             "2" => \@test2,
             "3" => \@test3,
             "4" => \@test4
            );
my %seen;
while( my ($key, $values) = each %hash ) {
  foreach (@{$values}) {
    # autovivification creates the nested hash entry on first use
    $seen{$_}{$key}++;
  }
  my @dup = grep { exists $seen{$_}{$key} && $seen{$_}{$key}>1 } keys %seen;
  if( $#dup > -1 ) {
    my $dupkey = join (" ",@dup);
    print "$key has the duplicate value $dupkey\n";
  }
}
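# values that appear under more than one key
# (keys on a hash reference was experimental and is an error since Perl 5.24, so dereference explicitly)
my @dupall = sort grep { scalar(keys %{ $seen{$_} })>1 } keys %seen;
foreach( @dupall ) {
  my @ardupkeys = sort keys %{ $seen{$_} };
  my $dupkeys = join(" and ",(join(",", @ardupkeys[0..$#ardupkeys-1]),$ardupkeys[-1]));
  print "$dupkeys have the duplicate value $_\n";
}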

Output:

3 has the duplicate value a
2 has the duplicate value d
1 and 3 have the duplicate value a
1 and 4 have the duplicate value b
2 and 4 have the duplicate value d
2,3 and 4 have the duplicate value e

Answer 1: (score: 0)

use strict; use warnings;

my @test1 = qw(a b c);
my @test2 = qw(d e d);
my @test3 = qw(a h a);
my %hash = ( "1" => \@test1,
             "2" => \@test2,
             "3" => \@test3,
            );

my %seen_in;

# sorted values look better
for my $key ( sort keys %hash ) {
  my $values = $hash{$key};
  my %count = ();

  my @dup = map { push(@{$seen_in{$_}},$key) unless $count{$_};  1==$count{$_}++ ? ($_) : () } @$values;
  for my $dupkey (@dup) {
    print "$key has the duplicate value $dupkey\n";
  }
}

for my $dupval (map {@{$seen_in{$_}}>1 ? ($_):()} sort keys %seen_in) {
   print join(" and ", @{$seen_in{$dupval}}), " have the duplicate value $dupval\n";
}

Answer 2: (score: 0)

With a little help from CPAN and friends, you can write a solution that scales well.

use strict;
use warnings;
use Algorithm::Combinatorics qw(combinations);
use Array::Utils qw(intersect);


my @test1 = qw(a b c);
my @test2 = qw(d e d);
my @test3 = qw(a h a);
my %hash = ( "1" => \@test1,
             "2" => \@test2,
             "3" => \@test3,
);

my @keys = keys %hash;

for my $key (keys %hash) {
    my %seen;
    $seen{$_}++ for @{$hash{$key}};
    print "$key has the duplicate value $_\n" for grep { $seen{$_} > 1 } keys %seen;
}

my $combination = combinations(\@keys, 2);
while (my $pair = $combination->next) {
    my ($first, $second) = @$pair;
    # intersect may return a shared value more than once, so dedupe via a hash
    my %common  = map { $_ => 1 } intersect(@{$hash{$first}}, @{$hash{$second}});
    print "$first and $second has duplicate value $_\n" for keys %common;
}

Output

3 has the duplicate value a
2 has the duplicate value d
1 and 3 has duplicate value a
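
If installing CPAN modules is not an option, the same pairwise comparison can be sketched with core Perl only. This is just an illustration, not a drop-in replacement for the modules above: `common_values` and `key_pairs` are hypothetical helper names made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: distinct values that occur in both array refs.
sub common_values {
    my ($left, $right) = @_;
    my %in_left = map { $_ => 1 } @$left;
    my %seen;
    # keep values also present in the left array; %seen drops repeats
    return grep { $in_left{$_} && !$seen{$_}++ } @$right;
}

# Hypothetical helper: every unordered pair of the given keys.
sub key_pairs {
    my @keys = @_;
    my @pairs;
    for my $i (0 .. $#keys - 1) {
        for my $j ($i + 1 .. $#keys) {
            push @pairs, [ $keys[$i], $keys[$j] ];
        }
    }
    return @pairs;
}

my %hash = (
    1 => [qw(a b c)],
    2 => [qw(d e d)],
    3 => [qw(a h a)],
);

for my $pair (key_pairs(sort keys %hash)) {
    my ($first, $second) = @$pair;
    for my $value (common_values($hash{$first}, $hash{$second})) {
        print "$first and $second has duplicate value $value\n";
    }
}
```

For the sample data this prints the single cross-key line `1 and 3 has duplicate value a`. Like the `combinations` approach, it compares every pair of keys, so the work grows quadratically with the number of keys.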

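Answer 3: (score: 0)

Here is a solution that builds a hash similar to the one in Andrzej's version, but avoids using map.

I have assumed that you only want to know whether a value appears more than once across any of the sets. If you need exactly what you showed in your question, see below.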

use strict;
use warnings;

my @test1 = qw(a b c);
my @test2 = qw(d e d);
my @test3 = qw(a h a);

my %hash = (
  1 => \@test1,
  2 => \@test2,
  3 => \@test3,
);

my %appears;

while (my ($set, $contents) = each %hash) {
  ++$appears{$_}{$set} for @$contents;
}

for my $item (sort keys %appears) {
  my $sets = $appears{$item};
  my $count;
  $count += $_ for values %$sets;
  next unless $count > 1;

  my @sets = sort { $a <=> $b } keys %$sets;

  printf "%s %s the duplicate value %s\n",
      join(', ', @sets) =~ s/,([^,]+\z)/ and$1/r,
      @sets > 1 ? 'have' : 'has',
      $item;
}

Output

1 and 3 have the duplicate value a
2 has the duplicate value d

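If you really want duplicates within a single set reported separately from duplicates across all sets, then just change the display code. Replace the final for loop with this:
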
for my $item (sort keys %appears) {
  my $sets = $appears{$item};
  my @sets = grep { $sets->{$_} > 1 } sort { $a <=> $b } keys %$sets;
  print "@sets has the duplicate value $item\n" if @sets;
}

for my $item (sort keys %appears) {
  my $sets = $appears{$item};
  next unless keys %$sets > 1;

  my @sets = sort { $a <=> $b } keys %$sets;

  printf "%s have the duplicate value %s\n",
      join(', ', @sets) =~ s/,([^,]+\z)/ and$1/r,
      $item;
}

Output

3 has the duplicate value a
2 has the duplicate value d
1 and 3 have the duplicate value a
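
If you would rather get the results back as data than print them straight away, the counting hash can be wrapped in a subroutine. This is only a sketch under a few assumptions: `find_duplicates` is a hypothetical name, and it sorts the set names as strings so they do not have to be numeric (unlike the `<=>` sort above).

```perl
use strict;
use warnings;

# Hypothetical helper: takes a reference to a hash of array refs and returns
# two hash refs:
#   within: value => sorted set names where the value occurs more than once
#   across: value => sorted set names, for values found in two or more sets
sub find_duplicates {
    my ($sets) = @_;

    # $appears{value}{set} = number of occurrences of value in that set
    my %appears;
    for my $set (keys %$sets) {
        $appears{$_}{$set}++ for @{ $sets->{$set} };
    }

    my (%within, %across);
    for my $value (keys %appears) {
        my $counts = $appears{$value};
        my @names  = sort keys %$counts;    # string sort: names need not be numeric
        my @repeat = grep { $counts->{$_} > 1 } @names;
        $within{$value} = \@repeat if @repeat;
        $across{$value} = \@names  if @names > 1;
    }
    return (\%within, \%across);
}

my %hash = (
    1 => [qw(a b c)],
    2 => [qw(d e d)],
    3 => [qw(a h a)],
);

my ($within, $across) = find_duplicates(\%hash);

for my $value (sort keys %$within) {
    print "@{ $within->{$value} } has the duplicate value $value\n";
}
for my $value (sort keys %$across) {
    print join(' and ', @{ $across->{$value} }), " have the duplicate value $value\n";
}
```

With the sample data this reports the same three lines as the output above. Keeping the reporting separate from the counting means the wording can be changed without touching the bookkeeping.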