Comparing key-value pairs in a hash in Perl

Time: 2014-08-19 12:10:42

Tags: perl compare hash

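I have a hash whose keys are scalar strings. Each value is another hash, in which the words of the string are the keys and their frequencies are the values.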

Structure:

 { 
  doc1 => { w1 => freq1 , w2 => freq2, .....} ,
  doc2 => { w1 => freq1 , w2 => freq2, .....} ,
  .
  .
  .
}

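I want to compare two keys (doc1, doc2, ...) and find the words that both documents have in common. The desired output is the sum of the frequencies of the common words between two documents, for every pair of documents.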

What is the best way to do this?

1 answer:

Answer 0: (score: 0)

Something like this:

#!/usr/bin/perl
use strict;
use warnings;

# Summed frequencies: one { word => total } hash ref per word
my @frequencies;

# First doc
my $doc1 = {
    w1 => 1 , w2 => 5, w3 => 1
};

# Second doc
my $doc2 = {
    w1 => 3 , w2 => 2, w3 => 1, w4 => 12
};

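# Walk the first doc: a word also found in the second doc gets both
# counts added together, otherwise it keeps its own count
foreach my $word (keys %{$doc1}) {
    if (exists $doc2->{$word}) {
        push (@frequencies, {$word => $doc1->{$word} + $doc2->{$word}});
    }
    else {
        push (@frequencies, {$word => $doc1->{$word}});
    }

    # Remove the handled word so the second loop sees only words unique
    # to the second doc (note that this modifies $doc2)
    delete $doc2->{$word};
}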

# Walk what is left of the second doc: words that occur only there
foreach my $word (keys %{$doc2}) {
    push (@frequencies, {$word => $doc2->{$word}});
}

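# Print each word with its summed frequency (hash order is not fixed,
# so the lines may appear in a different order from run to run)
print join "\n", map {sprintf("%s: %s", keys %$_, values %$_)} @frequencies;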

1;

Output

$ perl compare.pl
w3: 2
w1: 4
w2: 7
w4: 12
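
The code above merges two documents and also keeps words that occur in only one of them (w4 here). If only the words common to both documents should be counted, and for every pair of documents as the question asks, a minimal sketch could look like the following. The sample data, the third document doc3, the helper name common_word_sums and the %pair_sums layout are all assumptions made for illustration, and "sum of the frequencies" is read here as one sum per shared word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data in the shape described in the question
my %docs = (
    doc1 => { w1 => 1, w2 => 5, w3 => 1 },
    doc2 => { w1 => 3, w2 => 2, w3 => 1, w4 => 12 },
    doc3 => { w2 => 4, w4 => 1 },
);

# Given two word => frequency hash refs, return a hash ref holding only
# the words present in both, each mapped to the sum of its two counts.
# (Hypothetical helper; the inputs are not modified.)
sub common_word_sums {
    my ($first, $second) = @_;
    my %sums;
    for my $word (keys %$first) {
        next unless exists $second->{$word};
        $sums{$word} = $first->{$word} + $second->{$word};
    }
    return \%sums;
}

# Visit every unordered pair of documents exactly once
my @names = sort keys %docs;
my %pair_sums;
for my $i (0 .. $#names - 1) {
    for my $j ($i + 1 .. $#names) {
        my ($left, $right) = @names[$i, $j];
        $pair_sums{"$left,$right"} =
            common_word_sums($docs{$left}, $docs{$right});
    }
}

# Print the pairs and their shared words in sorted order
for my $pair (sort keys %pair_sums) {
    my $sums = $pair_sums{$pair};
    print "$pair: ",
        join(", ", map { "$_=$sums->{$_}" } sort keys %$sums), "\n";
}
```

With this sample data it prints:

doc1,doc2: w1=4, w2=7, w3=2
doc1,doc3: w2=9
doc2,doc3: w2=6, w4=13

If a single total per pair is wanted instead, the values of each inner hash can be added up, for example with sum0 from List::Util.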