Question

The Perl Digest module computes different SHA1 digests for the add and addfile functions. I created binary random data using /dev/urandom.

Running on Ubuntu:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.1 LTS
Release:        12.04
Codename:       precise

$ perl -v
This is perl 5, version 14, subversion 2 (v5.14.2) built for i686-linux-gnu-thread-multi-64int

Script output:

$ perl t.pl sha1 a.tmp
doesntwork      da39a3ee5e6b4b0d3255bfef95601890afd80709
works           ee49451434cffe001a568090c86f16f076677af5
$ openssl dgst -sha1 a.tmp
SHA1(a.tmp)= ee49451434cffe001a568090c86f16f076677af5

My code follows:

use strict;
use warnings;
use Switch;
use Digest;

sub doesntwork {
    my ($datafile, $hashfun) = @_;
    open(my $fh, "<", $datafile ) or die "error: Can't open '$datafile', $!\n";
    binmode($fh);
    read($fh, my $data, -s $datafile);
    close($fh);

    $hashfun->add($data);
    my $hashval = $hashfun->digest();

    return unpack('H*', $hashval);
}

sub works {
    my ($datafile, $hashfun) = @_;
    open(my $fh, "<", $datafile ) or die "error: Can't open '$datafile', $!\n";
    binmode($fh);

    $hashfun->addfile($fh);
    my $hashval = $hashfun->digest();

    close($fh);

    return unpack('H*', $hashval);
}

###############################################################################
(@ARGV >= 2) or die "usage: perl $0 algo datafile\n";
my ($algo, $datafile) = @ARGV;

my $hashfun;
switch($algo) {
    case "md5"    {$hashfun = Digest->new("MD5"    );}
    case "sha1"   {$hashfun = Digest->new("SHA-1"  );}
    case "sha256" {$hashfun = Digest->new("SHA-256");}
    case "sha512" {$hashfun = Digest->new("SHA-512");}
    else          {die "error: invalid algorithm '$algo'\n"}
}

print "doesntwork\t", doesntwork( $datafile, $hashfun ), "\n";
print "works     \t", works     ( $datafile, $hashfun ), "\n";

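I expected the add function to work, because I want to compute the digest on buffered data rather than on data read from a file. Perhaps add treats the data as text, while for addfile the binmode on the filehandle makes it use binary data. If so, how can I make add treat the buffer as binary data?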

Edited the post to print the size of the data read:

    $ stat -c "%n %s" a.tmp
    a.tmp 671088640

    $ openssl dgst -sha1 a.tmp
    SHA1(a.tmp)= 7dfcced1b0c8864e6a20b2daa63de7ffc1cd7a26

    #### Works
    $ perl -W -MDigest -e 'open(my $fh, "<", "a.tmp") or die "cant open $!\n";
    > binmode($fh);
    > my $hf = Digest->new("SHA-1");
    > $hf->addfile($fh);
    > print unpack("H*", $hf->digest()),"\n";
    > close($fh);'
    7dfcced1b0c8864e6a20b2daa63de7ffc1cd7a26

    #### Doesnt Work
    $ perl -W -MDigest -e 'open(my $fh, "<", "a.tmp") or die "cant open $!\n";
    > binmode($fh);
    > read($fh, my $data, -s "a.tmp") or die "cant read $!\n";
    > close($fh);
    > printf("## data.length=%d,file.length=%d\n",length($data),-s "a.tmp");
    > length($data)==(-s "a.tmp") or die "couldnt read all the data";
    > my $hf = Digest->new("SHA-1");
    > $hf->add($data);
    > print unpack("H*", $hf->digest()),"\n";'
    ## data.length=671088640,file.length=671088640
    9eecafd368a50fb240e0388e3c84c0c94bd6cc2a

Also tried this, based on Fred's answer:

    $ perl -W -MDigest -e '
    > open(my $fh, "<", "a.tmp") or die "cant open $!\n";
    > binmode($fh);
    > my $size = -s "a.tmp";
    > my $got = read($fh, my $data, $size) or die "cant read $!\n";
    > print "##read $got bytes, size=$size\n";
    > my $done = $size - $got;
    > print "done=$done, size=$size, got=$got\n";
    > until(!$done) {
    >   $got   = read($fh, my $newdata, $done);
    >   $done -= $got ;
    >   $data .= $newdata;
    >   print "##read1 $got bytes, size=$size, done=$done\n";
    > }
    > close($fh);
    > printf("## data.length=%d,file.length=%d\n",length($data),-s "a.tmp");
    > length($data)==(-s "a.tmp") or die "couldnt read all the data";
    > my $hf = Digest->new("SHA-1");
    > $hf->add($data);
    > print unpack("H*", $hf->digest()),"\n";'
    ##read 671088640 bytes, size=671088640
    done=0, size=671088640, got=671088640
    ## data.length=671088640,file.length=671088640
    9eecafd368a50fb240e0388e3c84c0c94bd6cc2a

Answer 1

You haven't provided the data that produces the problem, and I can't reproduce your issue using the Perl script itself as input.

Here is the definition of addfile:

sub addfile {
    my ($self, $handle) = @_;

    my $n;
    my $buf = "";

    while (($n = read($handle, $buf, 4*1024))) {
        $self->add($buf);
    }
    unless (defined $n) {
        require Carp;
        Carp::croak("Read failed: $!");
    }

    $self;
}

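Your claim that addfile works but add doesn't makes little sense, since addfile simply calls add on each chunk it reads. I suppose there could be a bug in the module when handling long strings, but it is more likely that you are passing different input to the module.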
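Because addfile is just a loop around add, the two paths can be cross-checked on an in-memory buffer. The following is a minimal sketch, not a diagnosis of the original problem: the helper names digest_oneshot and digest_chunked, the sample data, and the 4 KB chunk size are chosen here for illustration. It also prints the SHA-1 of empty input, which is the da39a3ee... value that doesntwork reported in the first run, so that run appears to have hashed no data at all.

```perl
use strict;
use warnings;
use Digest;

# Hypothetical helper: hash the whole buffer with a single add() call.
sub digest_oneshot {
    my ($data) = @_;
    return Digest->new("SHA-1")->add($data)->hexdigest;
}

# Hypothetical helper: hash the buffer in fixed-size pieces, the way
# addfile does with the chunks it reads from a filehandle.
sub digest_chunked {
    my ($data, $chunk) = @_;
    $chunk ||= 4 * 1024;    # assumed default, mirrors addfile's read size
    my $hf = Digest->new("SHA-1");
    for (my $off = 0; $off < length $data; $off += $chunk) {
        $hf->add(substr($data, $off, $chunk));
    }
    return $hf->hexdigest;
}

# Sample byte string (illustrative only, not the asker's data).
my $data = join '', map { chr(($_ * 7) % 256) } 1 .. 10_000;

print "oneshot  ", digest_oneshot($data), "\n";
print "chunked  ", digest_chunked($data), "\n";
print "empty    ", digest_oneshot(""), "\n";
```

If the two digests ever differ for a very large buffer, feeding add in chunks like this may be a workable way to hash buffered data without going through a filehandle.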

Answer 2

You need to test the return value of read. There is no guarantee that you have read the entire contents of the file.

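read in Perl is typically implemented as a call to the underlying fread library function. When you use a low-level read like this, you must test the return value to see whether you got as much as you asked for.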

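my $size = -s $datafile;
my $got  = read($fh, my $data, $size);
defined $got or die "read failed: $!\n";
my $done = $size - $got;    # bytes still left to read
while ($done > 0) {
    $got = read($fh, my $newdata, $done);
    defined $got or die "read failed: $!\n";
    last if $got == 0;      # EOF before the expected size
    $done -= $got;
    $data .= $newdata;
}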

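This is just off the top of my head, and it may have an obvious fencepost error. That is why I avoid read whenever possible. See http://perltricks.com/article/21/2013/4/21/Read-an-entire-file-into-a-string for some less painful approaches.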
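One common way to read a whole file without a manual read loop is to undefine the input record separator and use the readline operator. The sketch below is a hedged illustration of that idea: slurp_binary is a name invented here, the temporary file exists only to make the example runnable, and the length check against -s assumes a regular file that is not modified during the read.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the full contents of $path as raw bytes.
sub slurp_binary {
    my ($path) = @_;
    open(my $fh, "<:raw", $path) or die "error: Can't open '$path', $!\n";
    local $/;                          # no record separator: read to EOF
    my $data = <$fh>;
    close($fh);
    $data = "" unless defined $data;

    # Compare against the on-disk size to catch a short read.
    my $size = (-s $path) || 0;
    length($data) == $size
        or die "error: short read on '$path'\n";
    return $data;
}

# Demo: write 256 known bytes to a temporary file and read them back.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} pack("C*", 0 .. 255);
close($tmp);

my $data = slurp_binary($tmpname);
printf "read %d bytes\n", length $data;
```

The returned string can then be passed to add in the same way as the buffer built by the read loop.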
