Linear congruential generator - how to choose seeds and statistical tests

Time: 2019-07-11 08:54:18

Tags: python random lcg

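I need to build a linear congruential generator that will pass the statistical tests I choose.

My question is: how do I correctly choose the numbers (parameters) for the generator, and which statistical tests should I choose?

I was thinking of: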

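  1. Chi-square frequency test for uniformity

    • Collect 10,000 numbers per generation method

    • Subdivide [0, 1) into 10 equal subintervals

  2. Kolmogorov-Smirnov test for uniformity

    • Since the K-S test works better with smaller sets of numbers, you can use the first 100 of the 10,000 numbers generated for the chi-square frequency test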
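
The two tests listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not a definitive implementation: the helper names (`lcg_uniforms`, `chi_square_stat`, `ks_stat`) are made up for this sketch, the generator constants are the ones used in the code in this question, integers are mapped to [0, 1) by dividing by the modulus, and the 5% critical values are standard table values (16.919 for chi-square with 9 degrees of freedom, and the large-sample approximation 1.36/sqrt(n) for K-S).

```python
import math

def lcg_uniforms(seed, n, a=1664525, c=1013904223, m=2**32):
    """Return n floats in [0, 1) from an LCG (state divided by the modulus)."""
    out = []
    state = seed
    for _ in range(n):
        state = (a * state + c) % m
        out.append(state / m)
    return out

def chi_square_stat(samples, bins=10):
    """Chi-square statistic against equal expected counts in `bins` subintervals."""
    counts = [0] * bins
    for u in samples:
        counts[min(int(u * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((obs - expected) ** 2 / expected for obs in counts)

def ks_stat(samples):
    """Kolmogorov-Smirnov statistic D against the uniform CDF F(x) = x."""
    xs = sorted(samples)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

u = lcg_uniforms(seed=1, n=10_000)
chi2 = chi_square_stat(u)   # compare with 16.919 (9 degrees of freedom, 5% level)
d = ks_stat(u[:100])        # compare with 1.36 / sqrt(100) = 0.136 (5% level)
print("chi-square:", chi2, "reject" if chi2 > 16.919 else "do not reject")
print("K-S D:", d, "reject" if d > 1.36 / math.sqrt(100) else "do not reject")
```

A statistic above its critical value means uniformity is rejected at the 5% level. A library such as SciPy could replace the hand-written statistics if p-values are wanted.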

Here is the code example:

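def seedLCG(initVal):
    # Keep the generator state in a module-level variable
    global rand
    rand = initVal

def lcg():
    # Multiplier, increment and modulus of the recurrence
    a = 1664525
    c = 1013904223
    m = 2**32
    global rand
    rand = (a*rand + c) % m
    return rand

seedLCG(1)

for i in range(1000):
    print(lcg())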

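When it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it, and would it make sense at all? The idea is to show that the selected seeds were chosen randomly rather than pulled out of a hat.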
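
For reference, the nanosecond idea could be sketched as below, assuming Python 3.7 or later, where `time.time_ns()` is available. The helper name `seed_from_clock` is hypothetical. A clock-derived seed is hard to predict in advance, but this alone does not demonstrate that it was chosen "randomly", so the sketch also prints the seed to keep the run reproducible.

```python
import time

M = 2**32  # modulus of the LCG in this question

def seed_from_clock(m=M):
    """Reduce the nanosecond wall-clock counter to a valid LCG state in [0, m)."""
    return time.time_ns() % m

seed = seed_from_clock()
print("seed used for this run:", seed)  # record it so the run can be repeated
```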

1 answer:

Answer 0: (score: 1)

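Regarding how to correctly choose the numbers for the generator: the Wikipedia page describes the Hull–Dobell theorem, which tells you how to pick a and c to get a full-period generator. You took your numbers from Numerical Recipes, and as far as I can tell you will get a full-period [0 ... 2^32) generator. Alternatively, you could look at the figure of merit in this paper, which has (a, c) pairs for any desired period size.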
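
The Hull–Dobell conditions are easy to check mechanically. The sketch below is an illustration only, with hypothetical helper names (`prime_factors`, `has_full_period`, `period`). For c != 0 it checks that c and m are coprime, that a - 1 is divisible by every prime factor of m, and that a - 1 is divisible by 4 whenever m is. The brute-force `period` function is only practical for small m and is included to cross-check the theorem on toy parameters.

```python
from math import gcd

def prime_factors(n):
    """Return the set of prime factors of n by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, c, m):
    """Check the Hull-Dobell conditions for x -> (a*x + c) mod m."""
    if c == 0 or gcd(c, m) != 1:                    # c and m must be coprime
        return False
    if any((a - 1) % p for p in prime_factors(m)):  # a-1 divisible by each prime factor of m
        return False
    if m % 4 == 0 and (a - 1) % 4 != 0:             # a-1 divisible by 4 if m is
        return False
    return True

def period(a, c, m, seed=0):
    """Length of the cycle eventually reached from seed (brute force, small m only)."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]

print(has_full_period(1664525, 1013904223, 2**32))  # the constants from the question
print(has_full_period(5, 3, 16), period(5, 3, 16))  # toy case that meets the conditions
print(has_full_period(3, 1, 16), period(3, 1, 16))  # toy case that does not
```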

Concerning tests, take a look at the paper @pjs provided.

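"When it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it, and would it make sense at all? The idea is to show that the selected seeds were chosen randomly rather than pulled out of a hat."

I don't think this is a good idea, because you cannot guarantee that seeds picked from the time, out of a hat, and so on will not lead to overlapping sequences. An LCG is basically a bijective mapping [0 ... 2^32) <-> [0 ... 2^32), so it is relatively easy for streams of random numbers to overlap, in which case your results are correlated.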

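Instead, I would suggest using another nice property of LCGs: logarithmic skip-ahead (and skip-back). To simulate on N cores, you can pick a single seed and run it on the first core, use the same seed with skip(2^32 / N) on the second core, the same seed with skip(2^32 / N * 2) on the third, and so on, so that each core works through its own non-overlapping block of the full period.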
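
The per-core scheme can be sketched with plain Python integers. This is an illustration only: the names `step`, `jump` and `core_seeds` are hypothetical, the constants are the ones from the question, and the block size is taken as the period 2^32 divided by the number of cores. Reducing n modulo the period relies on the generator having full period.

```python
A, C, M = 1664525, 1013904223, 2**32  # constants from the question

def step(x):
    """One ordinary LCG step."""
    return (A * x + C) % M

def jump(x, n):
    """Advance state x by n steps using O(log n) multiplications."""
    n %= M                   # full period is M, so a negative n wraps around
    a_acc, c_acc = 1, 0      # accumulated map: x -> a_acc*x + c_acc
    a, c = A, C              # map for the current power-of-two stride
    while n > 0:
        if n & 1:
            a_acc = (a_acc * a) % M
            c_acc = (c_acc * a + c) % M
        c = ((a + 1) * c) % M    # doubling the stride: compose the map with itself
        a = (a * a) % M
        n >>= 1
    return (a_acc * x + c_acc) % M

def core_seeds(seed, n_cores):
    """Starting state per core: core k begins k * (M // n_cores) steps in."""
    block = M // n_cores
    return [jump(seed, k * block) for k in range(n_cores)]

for k, s in enumerate(core_seeds(seed=1, n_cores=4)):
    print("core", k, "starts at state", s)
```

Each core then steps its own state with `step`. As long as no core draws more than `M // n_cores` numbers, the streams do not overlap.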

LCG code with explicit state and skip is below (Win10 x64, Python 3.7, Anaconda):

import numpy as np

class LCG(object):

    UZERO: np.uint32 = np.uint32(0)
    UONE : np.uint32 = np.uint32(1)

    def __init__(self, seed: np.uint32, a: np.uint32, c: np.uint32) -> None:
        self._seed: np.uint32 = np.uint32(seed)
        self._a   : np.uint32 = np.uint32(a)
        self._c   : np.uint32 = np.uint32(c)

    def next(self) -> np.uint32:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint32:
        return self._seed

    def set_seed(self, seed: np.uint32) -> None:
        self._seed = np.uint32(seed)

    def skip(self, ns: np.int32) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

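        # Reduce modulo 2**32 first so a negative skip wraps to a forward one;
        # recent NumPy versions may refuse np.uint32() of a negative Python int
        nskip: np.uint32 = np.uint32(int(ns) & 0xFFFFFFFF)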

        a: np.uint32 = self._a
        c: np.uint32 = self._c

        a_next: np.uint32 = LCG.UONE
        c_next: np.uint32 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next


#%%
np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)

print(rng.next())
print(rng.next())
print(rng.next())

rng.skip(-3) # back by 3
print(rng.next())
print(rng.next())
print(rng.next())

rng.skip(-3) # back by 3
rng.skip(2) # forward by 2
print(rng.next())

Update

Generating 10,000 numbers:

np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)
q = [rng.next() for _ in range(0, 10000)]