Question

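I am trying to find the differences between two files that contain key/value entries, and to report every key or value that was added or deleted. At the moment I use the Linux diff tool for this, but diff naturally treats a change in the order of the values as a difference. I don't want those listed, because for my purposes they are not real changes.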

File 1:

key1    kamal1.google.com kamal2.google.com kamal3.google.com 
key2    kamal4.google.com

File 2:

key1    kamal1.google.com kamal6.google.com kamal3.google.com 
key3    kamal4.google.com

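What I need

It should report something like: deleted key2 with values kamal4.google.com, added key3 with kamal4.google.com, deleted kamal2.google.com from key1, added kamal6.google.com to key1.
The messages are only representative; they can be reworded into something more meaningful.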

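My approach

Read each file into a hash of hashes: key1 => {kamal1.google.com => 1, ...}, key2 => {kamal4.google.com => 1}. I store the values as hash keys too, rather than as an array, so that they can be compared efficiently.
Iterate over the keys of both hashes and check which of the two hashes each key exists in.
Make a recursive call to find the differences in the values (since they are hashes again).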
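
To make the data structure from step 1 concrete, here is a minimal sketch using the sample data from File 1 and File 2. The helper name `only_in_first` is an assumption made for illustration and does not appear in the code below; it only shows why storing values as hash keys makes the order of the values irrelevant.

```perl
use strict;
use warnings;

# Sample data in the shape described in step 1: key => { value => 1, ... }
my %old = (
    key1 => { map { $_ => 1 } qw(kamal1.google.com kamal2.google.com kamal3.google.com) },
    key2 => { 'kamal4.google.com' => 1 },
);
my %new = (
    key1 => { map { $_ => 1 } qw(kamal1.google.com kamal6.google.com kamal3.google.com) },
    key3 => { 'kamal4.google.com' => 1 },
);

# Hypothetical helper: values that are keys of the first hash ref
# but not of the second. Order in the input file does not matter.
sub only_in_first {
    my ($first, $second) = @_;
    my @only = sort grep { !exists $second->{$_} } keys %$first;
    return @only;
}

my @removed = only_in_first($old{key1}, $new{key1});
my @added   = only_in_first($new{key1}, $old{key1});

print "removed from key1: @removed\n";
print "added to key1: @added\n";
```

Because each value is looked up with `exists`, reordering the values on a line produces no difference at all, which is the behaviour diff cannot provide.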

Problems with my code:
- It does not work for nested data.
- It loses track of the parent key.

Code:

use strict;
use warnings;
use Data::Dumper;

my $file1 = 'file1';
my $file2 = 'file2';

my $old = hashifyFile($file1);
my $new = hashifyFile($file2);
my $result = {};
compareHashes($old, $new, $result);
print Dumper $result;

sub compareHashes {
    my ($hash1, $hash2, $result) = @_;

    for my $key (keys %$hash1, keys %$hash2) {
        if (not exists $hash2->{$key}) {
            push @{$result->{deleted}->{$key}}, keys %{$hash1->{$key}};
        } elsif (not exists $hash1->{$key}) {
            push @{$result->{added}->{$key}}, keys %{$hash2->{$key}};
        } elsif (ref $hash1->{$key} eq 'HASH' or ref $hash2->{$key} eq 'HASH') {
            compareHashes($hash1->{$key}, $hash2->{$key}, $result);
        }
    }
}

# helper functions
sub trim {
   my $val = shift;
   $val =~ s/^\s*|\s*$//g;
   return $val;
}


sub hashifyFile {
    my $file = shift;
    my $contents = {};
    open my $file_fh, '<', $file or die "couldn't open $file $!";

    my ($key, @val);
    while (my $line = <$file_fh>) {
        # skip blank lines and comments
        next if $line =~ /^\s*$/;
        next if $line =~ /^#/;
        # print "$. $line";

        # if the line starts with a word character, it is a "key values" line
        # if it starts with at least 4 spaces, treat it as more values for the previous key
        if ($line =~ /^\w/) {
            ($key, @val) = split /\s+|=/, $line;
        } elsif ($line =~ /^\s{4,}\w/) {
            push @val, split /\s+/, $line;
        }
        my %temp_hash;
        for (@val) {
                # next unless $_;
                $temp_hash{trim($_)} = 1 if trim($_);
        }
        $key = trim($key);
        $contents->{$key} = \%temp_hash if defined $key;

    }

    close $file_fh;
    return $contents;
}

Answer 1

Here is an example of how this could be done, based on your description. Please let me know whether this is what you want.

sub compareHashes {
    my ($hash1, $hash2, $result, $parent) = @_;

    my %all_keys = map {$_ => 1} keys %$hash1, keys %$hash2;

    for my $key (keys %all_keys) {
        if (not exists $hash2->{$key}) {
            if ( defined $parent ) {
                push @{$result->{deleted}->{$parent}}, $key;
            }
            else {
                push @{$result->{deleted}->{$key}}, keys %{$hash1->{$key}};
            }
        } elsif (not exists $hash1->{$key}) {
            if ( defined $parent ) {
                push @{$result->{added}->{$parent}}, $key;
            }
            else {
                push @{$result->{added}->{$key}}, keys %{$hash2->{$key}};
            }
        }
        else {
            if ((ref $hash1->{$key} eq 'HASH') and (ref $hash2->{$key} eq 'HASH') ) {
                compareHashes($hash1->{$key}, $hash2->{$key}, $result, $key);
            }
        }
    }
}

Output:

$VAR1 = {
          'added' => {
                       'key3' => [
                                   'kamal4.google.com'
                                 ],
                       'key1' => [
                                   'kamal6.google.com'
                                 ]
                     },
          'deleted' => {
                         'key2' => [
                                     'kamal4.google.com'
                                   ],
                         'key1' => [
                                     'kamal2.google.com'
                                   ]
                       }
        };
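
The question asked for readable messages rather than a dumped structure. As a rough sketch, the result above could be turned into such messages as follows. The function name `format_changes` and the exact wording of the messages are assumptions, not part of the answer's code. Since the result stores a removed key and a removed value in the same shape, the sketch also takes the two compared hashes in order to tell the two cases apart.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the {added}/{deleted} result into message lines.
# $old and $new are the two hashes that were compared.
sub format_changes {
    my ($result, $old, $new) = @_;
    my @messages;

    for my $key (sort keys %{ $result->{deleted} || {} }) {
        my $values = join ' ', sort @{ $result->{deleted}{$key} };
        # A key missing from $new was removed entirely; otherwise only
        # some of its values were removed.
        push @messages, exists $new->{$key}
            ? "deleted $values from $key"
            : "deleted $key with values $values";
    }
    for my $key (sort keys %{ $result->{added} || {} }) {
        my $values = join ' ', sort @{ $result->{added}{$key} };
        push @messages, exists $old->{$key}
            ? "added $values to $key"
            : "added $key with values $values";
    }
    return @messages;
}

# Sample data matching File 1, File 2 and the output shown above.
my $old = {
    key1 => { map { $_ => 1 } qw(kamal1.google.com kamal2.google.com kamal3.google.com) },
    key2 => { 'kamal4.google.com' => 1 },
};
my $new = {
    key1 => { map { $_ => 1 } qw(kamal1.google.com kamal6.google.com kamal3.google.com) },
    key3 => { 'kamal4.google.com' => 1 },
};
my $result = {
    added   => { key3 => ['kamal4.google.com'], key1 => ['kamal6.google.com'] },
    deleted => { key2 => ['kamal4.google.com'], key1 => ['kamal2.google.com'] },
};

print "$_\n" for format_changes($result, $old, $new);
```

Sorting the keys and values keeps the report stable between runs, since Perl does not guarantee hash ordering.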

Answer 2

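There are several modules on CPAN that compare deeply nested data structures. They differ mainly in how they encode the differences. Here is a selected list: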
