Question

I want to hash a given file with multiple algorithms, but right now I do it sequentially, like this:

return [
    hash_file('md5', $uri),
    hash_file('sha1', $uri),
    hash_file('sha256', $uri)
];

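Is there any way to hash this file by opening only one stream instead of N, where N is the number of algorithms I want to use? Something like this: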

return hash_file(['md5', 'sha1', 'sha256'], $uri);

Answer 1

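You can open a file pointer and use hash_init() and hash_update() to compute the hashes over the file without opening it more than once, then call hash_final() to get the resulting hashes.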

<?php
function hash_file_multi($algos, $filename) {
    if (!is_array($algos)) {
        throw new \InvalidArgumentException('First argument must be an array');
    }

    if (!is_string($filename)) {
        throw new \InvalidArgumentException('Second argument must be a string');
    }

    if (!file_exists($filename)) {
        throw new \InvalidArgumentException('Second argument, file not found');
    }

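    $result = [];
    $ctx = [];
    $fp = fopen($filename, "rb"); // binary mode, so bytes are read unmodified on every platform
    if ($fp) {
        // init hash contexts
        foreach ($algos as $algo) {
            $ctx[$algo] = hash_init($algo);
        }

        // calculate hash
        while (!feof($fp)) {
            $buffer = fgets($fp, 65536);
            if ($buffer === false) {
                break; // nothing more to read (EOF or read error)
            }
            foreach ($ctx as $key => $context) {
                hash_update($ctx[$key], $buffer);
            }
        }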

        // finalise hash and store in return
        foreach ($algos as $algo) {
            $result[$algo] = hash_final($ctx[$algo]);
        }

        fclose($fp);
    } else {
        throw new \InvalidArgumentException('Could not open file for reading');
    }   
    return $result;
}

$result = hash_file_multi(['md5', 'sha1', 'sha256'], $uri);

var_dump($result['md5'] === hash_file('md5', $uri)); //true
var_dump($result['sha1'] === hash_file('sha1', $uri)); //true
var_dump($result['sha256'] === hash_file('sha256', $uri)); //true

Also posted to the PHP manual: http://php.net/manual/en/function.hash-file.php#122549
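
One thing the function above does not check is whether every requested algorithm actually exists. As far as I know, hash_init() throws a ValueError for an unknown name on PHP 8 and returns false with a warning on PHP 7, so it can be worth rejecting bad names up front. Here is a minimal sketch; the helper name filter_supported_algos is made up for illustration, and it assumes algorithm names are passed in lowercase, the way hash_algos() reports them:

```php
<?php
// Hypothetical helper (not part of the answer above): reject unknown
// algorithm names before any hash context is created.
function filter_supported_algos(array $algos) {
    // hash_algos() lists every algorithm this PHP build provides.
    // Assumption: callers pass lowercase names; array_diff() is case-sensitive.
    $unknown = array_diff($algos, hash_algos());
    if ($unknown) {
        throw new \InvalidArgumentException(
            'Unsupported hash algorithm(s): ' . implode(', ', $unknown)
        );
    }
    // Drop duplicates so each algorithm is computed only once.
    return array_values(array_unique($algos));
}

print_r(filter_supported_algos(['md5', 'sha1', 'sha256']));
```

You could call it at the top of hash_file_multi() and loop over its return value instead of the raw $algos array.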

Answer 2

Here is a modification of Lawrence Cherone's solution* that reads the file only once and even works for non-seekable streams such as STDIN:

<?php
function hash_stream_multi($algos, $stream) {
    if (!is_array($algos)) {
        throw new \InvalidArgumentException('First argument must be an array');
    }

    if (!is_resource($stream)) {
        throw new \InvalidArgumentException('Second argument must be a resource');
    }

    $result = [];
    $ctx = [];
    foreach ($algos as $algo) {
        $ctx[$algo] = hash_init($algo);
    }
    while (!feof($stream)) {
        $chunk = fread($stream, 1 << 20);  // read data in 1 MiB chunks
        if ($chunk === false) {
            break;  // read error: stop instead of looping forever
        }
        foreach ($algos as $algo) {
            hash_update($ctx[$algo], $chunk);
        }
    }
    foreach ($algos as $algo) {
        $result[$algo] = hash_final($ctx[$algo]);
    }
    return $result;
}

// test: hash standard input with MD5, SHA-1 and SHA-256
$result = hash_stream_multi(['md5', 'sha1', 'sha256'], STDIN);
print_r($result);


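It works by reading chunks of data from the input stream with fread() (1 megabyte at a time, which should give a reasonable balance between performance and memory use) and then feeding each chunk to every hash with hash_update().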
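
To see that the chunk size only affects performance and not the result, here is a small self-contained check. It is a sketch, not part of the function above: the sample data, the php://memory stream and the deliberately small 4 KiB chunk size are all just choices for the demonstration.

```php
<?php
// Put some sample data into an in-memory stream so no file is needed.
$data = str_repeat("line of sample data\n", 1000);
$stream = fopen('php://memory', 'w+b');
fwrite($stream, $data);
rewind($stream);

$algos = ['md5', 'sha1', 'sha256'];
$ctx = [];
foreach ($algos as $algo) {
    $ctx[$algo] = hash_init($algo);
}

// Read in small 4 KiB chunks and feed every chunk to every context.
while (!feof($stream)) {
    $chunk = fread($stream, 4096);
    if ($chunk === false) {
        break;
    }
    foreach ($algos as $algo) {
        hash_update($ctx[$algo], $chunk);
    }
}
fclose($stream);

$digests = [];
foreach ($algos as $algo) {
    $digests[$algo] = hash_final($ctx[$algo]);
}

// Compare each incremental digest with the one-shot digest of the same bytes.
foreach ($algos as $algo) {
    var_dump($digests[$algo] === hash($algo, $data));
}
```

Each comparison should come out true, because a hash context only sees the concatenated byte sequence, no matter how it was split into chunks.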

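*) Lawrence updated his answer while I was writing this one, but I feel mine is still distinct enough to justify keeping both. The main differences between this solution and Lawrence's updated version are that my function takes an input stream instead of a filename, and that I use fread() rather than fgets() (since for hashing there is no need to split the input at newlines).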
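
If you prefer the filename-based interface from the question, the stream version can be wrapped in a few lines. This is only a sketch: the names hash_stream_multi_sketch and hash_file_multi_sketch are made up here, and the first function is a compact restatement of the stream-based approach so that the example runs on its own.

```php
<?php
// Compact restatement of the stream-based approach (hypothetical name).
function hash_stream_multi_sketch(array $algos, $stream) {
    $ctx = [];
    foreach ($algos as $algo) {
        $ctx[$algo] = hash_init($algo);
    }
    while (!feof($stream)) {
        $chunk = fread($stream, 1 << 20);   // 1 MiB per read
        if ($chunk === false) {
            break;                          // read error: stop
        }
        foreach ($algos as $algo) {
            hash_update($ctx[$algo], $chunk);
        }
    }
    $result = [];
    foreach ($algos as $algo) {
        $result[$algo] = hash_final($ctx[$algo]);
    }
    return $result;
}

// Hypothetical wrapper: open the file once in binary mode, delegate to the
// stream function, and always close the handle afterwards.
function hash_file_multi_sketch(array $algos, $filename) {
    $stream = @fopen($filename, 'rb');
    if ($stream === false) {
        throw new \RuntimeException("Could not open $filename for reading");
    }
    try {
        return hash_stream_multi_sketch($algos, $stream);
    } finally {
        fclose($stream);
    }
}

// Usage: hash a temporary file and compare against hash_file().
$tmp = tempnam(sys_get_temp_dir(), 'hash');
file_put_contents($tmp, "first line\r\nsecond line\nno trailing newline");
$result = hash_file_multi_sketch(['md5', 'sha1', 'sha256'], $tmp);
var_dump($result['sha256'] === hash_file('sha256', $tmp));
unlink($tmp);
```

Keeping the stream function as the core and the filename wrapper on top means the same code still handles STDIN, sockets and other non-seekable sources.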

How to hash a file with multiple algorithms at the same time in PHP?
