Question

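This is a question about the kernel implementation of Linux's /dev/urandom. If a user asks to read a very large amount of data (gigabytes) and no entropy is added to the pool during the read, is it possible to predict the next data generated by urandom from the data already read?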

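The usual case is that entropy is added to the pool frequently, but in my case we can assume there is no additional entropy (for example, its addition has been disabled by a kernel patch). So in my case the question is about the urandom algorithm itself.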

The source is /drivers/char/random.c, or http://www.google.com/codesearch#KMCRKdMbI4g/drivers/char/random.c&q=urandom%20linux&type=cs&l=116

or http://lxr.linux.no/linux+v3.3.3/drivers/char/random.c

 // data copying loop
    while (nbytes) {
            extract_buf(r, tmp);

            /* copy at most EXTRACT_SIZE bytes per iteration */
            i = min_t(int, nbytes, EXTRACT_SIZE);
            memcpy(buf, tmp, i);
            nbytes -= i;
            buf += i;
            ret += i;
    }

static void extract_buf(struct entropy_store *r, __u8 *out)
{
        int i;
        __u32 hash[5], workspace[SHA_WORKSPACE_WORDS];
        __u8 extract[64];

        /* Generate a hash across the pool, 16 words (512 bits) at a time */
        sha_init(hash);
        for (i = 0; i < r->poolinfo->poolwords; i += 16)
                sha_transform(hash, (__u8 *)(r->pool + i), workspace);

        /*
         * We mix the hash back into the pool to prevent backtracking
         * attacks (where the attacker knows the state of the pool
         * plus the current outputs, and attempts to find previous
         * outputs), unless the hash function can be inverted. By
         * mixing at least a SHA1 worth of hash data back, we make
         * brute-forcing the feedback as hard as brute-forcing the
         * hash.
         */
        mix_pool_bytes_extract(r, hash, sizeof(hash), extract);

        /*
         * To avoid duplicates, we atomically extract a portion of the
         * pool while mixing, and hash one final time.
         */
        sha_transform(hash, extract, workspace);
        memset(extract, 0, sizeof(extract));
        memset(workspace, 0, sizeof(workspace));

        /*
         * In case the hash function has some recognizable output
         * pattern, we fold it in half. Thus, we always feed back
         * twice as much data as we output.
         */
        hash[0] ^= hash[3];
        hash[1] ^= hash[4];
        hash[2] ^= rol32(hash[2], 16);
        memcpy(out, hash, EXTRACT_SIZE);
        memset(hash, 0, sizeof(hash));
}

There is a mechanism to prevent backtracking, but what about "forward tracking"?

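For example: I make a single read syscall of 500 MB from urandom. If all of the data up to the 200th megabyte is known and no additional entropy enters the pool, can I predict what the 201st megabyte will be?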

Answer 1

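In principle, yes, you can predict it. When no entropy is available, /dev/urandom becomes a PRNG, and once its internal state is known, its output can in principle be predicted. In practice it is not that simple, because the internal state is fairly large and the hash function prevents us from working backwards from the output. The state could be determined by trial and error, but that would likely take a very long time.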

Answer 2

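The definition of a "cryptographically strong pseudo-random number generator" is that it is computationally infeasible to distinguish its output from the output of a true random number generator. If you could predict future output from past output, you could make that distinction; so unless the Linux urandom algorithm is weak, you cannot do this.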

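This code does not look like any standard pseudo-random generator to me (Linux people have an unfortunate habit of "rolling their own"), but breaking it in any way would probably be a publishable result. So if it is breakable, I suspect it is not easy.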

Certainly, the design intent is for the answer to your question to be "no".
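The link between prediction and distinguishing can be stated a little more formally as the textbook "next-bit test". The symbols below ($G$, $A$, $s$, $n$, $\mathrm{negl}$) are introduced here for illustration and do not come from the kernel source. A generator $G$ with an $n$-bit seed passes the test if, for every probabilistic polynomial-time predictor $A$ and every output position $i$:

```latex
\Pr_{s \leftarrow \{0,1\}^n}\bigl[\, A(x_1, \dots, x_i) = x_{i+1} \,\bigr]
  \;\le\; \frac{1}{2} + \mathrm{negl}(n),
\qquad \text{where } x_1 x_2 \dots = G(s).
```

By Yao's theorem, passing the next-bit test is equivalent to being computationally indistinguishable from uniform random bits. So a practical forward-prediction attack would itself be a distinguisher.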

[Edit]

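Of course, from an information-theoretic point of view the answer is yes, since you cannot get unbounded entropy out of finite entropy. But from an information-theoretic point of view, no cipher other than the one-time pad is secure either. I assume you are asking in the practical/cryptographic sense.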

[Edit 2]

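A little searching turns up this paper, which claims to demonstrate an attack on the "forward security" of Linux's /dev/urandom. (That is, given the state of the generator, try to reconstruct earlier states.)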

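This is why programmers should never try to invent their own cryptography. No matter how clever you think you are, some Israeli academic who does this for a living will make you look foolish.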

That said, I have not seen any attack that predicts the generator's future output, which is what you are asking about.

Linux /dev/urandom forward prediction