Comparing the MD5s of files in a directory against an array (Perl)

Time: 2013-12-08 16:49:44

Tags: arrays perl directory compare md5

I was looking at this link: How could I write a Perl script to calculate the MD5 sum of every file in a directory?

It gets the MD5 of every file in a given directory. What I want to do is take those MD5s and compare them against an array. This is what I have so far.

use warnings;
use strict;
use Digest::MD5 qw(md5_hex);

my $dirname = "./";
opendir( DIR, $dirname );
my @files = readdir(DIR);
closedir(DIR);

print "@files\n";

foreach my $file (@files) {
    if ( -d $file || !-r $file ) { next; }
    open( my $FILE, $file );
    binmode($FILE);
    print Digest::MD5->new->addfile($FILE)->hexdigest, " $file\n";
    my @array = ('667fc8db8e5519cacbf8f9f2af2e0b08');
        if (@array ~~ $FILE) {
            print "matches array", "\n";
        } else {
            print "doesnt match array", "\n";
    }
}
system ( 'pause' )


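But with this I always get "doesnt match array", whether or not the file's digest is actually in the array. I can print @array, and it shows the very same MD5 value as the file's. But as I said, it always prints "doesnt match array"; I never get "matches array" for any file. Thanks for looking :)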

Edit: this is what I have now.

use warnings;
use strict;
use Digest::MD5 qw(md5_hex);

my $dirname = "./";
opendir( DIR, $dirname );
my @files = readdir(DIR);
closedir(DIR);

print "@files\n";

foreach my $file (@files) {
    next if -d $file || !-r $file;
    open( my $FILE, $file );
    binmode($FILE);
    #print Digest::MD5->new->addfile($FILE)->hexdigest, " $file\n";
    my $digest = Digest::MD5->new->addfile($FILE)->hexdigest;

    my @array = ('667fc8db8e5519cacbf8f9f2af2e0b08');
        if($digest eq $array[0]) {
            print "matches array", "\n";
        } else {
            print "doesnt match array", "\n";
    }
}
system ( 'pause' );


Thanks for all the help, everyone. You guys are awesome! :)

3 answers:

Answer 0: (score: 2)

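Please don't use the smartmatch operator ~~. It was declared experimental in recent versions of Perl (as of 5.18), and its semantics may change in the future.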
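If the digests have to stay in an array, a smartmatch-free membership test can be sketched with grep in scalar context, which returns the number of matching elements. The key point is to compare the digest string, not the filehandle. The @known list and the is_known helper below are illustrative names, and the digests are computed from in-memory strings so that the sketch runs without any files:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical list of known digests; the second entry is the MD5 of "hello".
my @known = (
    '667fc8db8e5519cacbf8f9f2af2e0b08',
    md5_hex('hello'),
);

# Returns a true value if $digest equals any element of the list.
sub is_known {
    my ($digest, @list) = @_;
    return scalar grep { $_ eq $digest } @list;
}

print is_known(md5_hex('hello'), @known) ? "matches array\n" : "doesnt match array\n";
print is_known(md5_hex('world'), @known) ? "matches array\n" : "doesnt match array\n";
```

This scans the whole array on every call, which is fine for a handful of digests.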

The best solution is to create a hash of the fingerprints you know:

my %fingerprints;
$fingerprints{"667fc8db8e5519cacbf8f9f2af2e0b08"} = undef;

If you want to load an entire array of fingerprints into the hash so that existence can be tested easily, you can use a hash slice:

@fingerprints{@array} = ();

Next, we store the fingerprint of the current file in a variable:

my $digest = Digest::MD5->new->addfile($FILE)->hexdigest;

Then we test whether $digest exists in the fingerprint hash:

if (exists $fingerprints{$digest}) {
  print "$digest for <$file> -- FOUND\n";
}
else {
  print "$digest for <$file>\n";
}

Using a hash is generally faster than looping over an array (if you do many lookups).
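The pieces above can be put together in a small self-contained sketch. It assumes the known digests come from in-memory strings rather than files (the strings 'alpha', 'beta' and 'gamma' are made up for the demo), so it can be run as-is:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical known digests, computed from in-memory strings for the demo.
my @array = map { md5_hex($_) } ('alpha', 'beta');

# Hash slice: every element of @array becomes a key whose value is undef.
my %fingerprints;
@fingerprints{@array} = ();

for my $content ('alpha', 'gamma') {
    my $digest = md5_hex($content);
    # exists tests for the key itself, so the undef values do not matter.
    my $status = exists $fingerprints{$digest} ? ' -- FOUND' : '';
    print "$digest for <$content>$status\n";
}
```

Only the line for 'alpha' should carry the FOUND marker, since 'gamma' was never loaded into the hash.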


Suggested complete program:

use strict;
use warnings;
use feature qw< say >;
use autodie;  # automatic error handling
use Digest::MD5;

my ($dirname, $fingerprint_file) = @ARGV; # takes two command line arguments
length $dirname          or die "First argument must be a directory name\n";
length $fingerprint_file or die "Second argument must be a file with fingerprints\n";

# load the fingerprints
my %fingerprints;
open my $fingerprints_fh, "<", $fingerprint_file;
while (<$fingerprints_fh>) {
  chomp;
  $fingerprints{$_} = undef;
}
close $fingerprints_fh;

opendir my $directory, $dirname;
while(my $file = readdir $directory) {
  next if not -f "$dirname/$file";  # test the path inside $dirname, not the current directory

  open my $fh, "<:raw", "$dirname/$file";
  my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
  close $fh;

  if (exists $fingerprints{$digest}) {
    say qq($digest "$file" -- FOUND);
  }
  else {
    say qq($digest "$file");
  }
}
closedir $directory;

Example invocation:

> perl script.pl . digests.txt
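The program reads the fingerprint file one line at a time and uses each chomped line as a hash key, so the file is expected to hold one hex digest per line. One way to produce such a file might look like the sketch below; the digest_of helper and the script name make_digests.pl are hypothetical and not part of the answer:

```perl
use strict;
use warnings;
use autodie;  # open/close die on failure
use Digest::MD5;

# Return the hex MD5 digest of the file at $path, read in binary mode.
sub digest_of {
    my ($path) = @_;
    open my $fh, '<:raw', $path;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Print one digest per line for every plain file named on the command line.
for my $path (grep { -f } @ARGV) {
    print digest_of($path), "\n";
}
```

Redirecting its output, for example with perl make_digests.pl known1 known2 > digests.txt, would give a file in the format the complete program loads.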

Answer 1: (score: 2)

Perhaps the following will help:

use warnings;
use strict;
use Digest::MD5 qw(md5_hex);
use File::Basename;

my $dirname = './';
my %MD5s    = (
    '667fc8db8e5519cacbf8f9f2af2e0b08' => 1,
    '8c0452b597bc2c261ded598a65b043b9' => 1
);

for my $file ( grep { !-d and -r } <$dirname*> ) {
    open my $FILE, '<', $file or die $!;
    binmode $FILE;
    my $md5hexdigest = Digest::MD5->new->addfile($FILE)->hexdigest;
    close $FILE;

    print basename ($file), " md5hexdigest $md5hexdigest ";

    if ( $MD5s{$md5hexdigest} ) {
        print "matches hash", "\n";
    }
    else {
        print "doesn't match hash", "\n";
    }
}

Example output:

XOR_String_Match.pl md5hexdigest 8c0452b597bc2c261ded598a65b043b9 matches hash
zipped.txt md5hexdigest d41d8cd98f00b204e9800998ecf8427e doesn't match hash

Answer 2: (score: 1)

Like this:

my $digest = Digest::MD5->new->addfile($FILE)->hexdigest;

Then:

if($digest eq $array[0])

By the way, this (earlier in your code) might be slightly more idiomatic:

next if -d $file || !-r $file;