Question

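I want to generate random Poisson-distributed numbers such that the generated numbers sum to 1000 and each one lies between a lower and an upper bound (3 to 30).

I can generate the random numbers with numpy: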

 
In [2]: np.random.poisson(5, 150)
array([ 4,  4,  6,  4,  8,  6,  4,  2,  6,  8,  8,  8,  1,  4,  3,  4,  1,
        3,  7,  6,  7,  4,  5,  5,  7,  6,  5,  3,  3,  5,  4,  6,  2,  0,
        3,  5,  6,  2,  5,  2,  4,  7,  4,  7,  8,  5,  6,  1,  4,  4,  7,
        4,  7,  2,  7,  4,  3,  8, 10,  2,  5,  7,  6,  3,  5,  7,  8,  5,
        4,  7,  8,  8,  2,  2, 10,  6,  3,  5,  2,  5,  5,  6,  4,  6,  4,
        0,  4,  3,  5,  8,  6,  7,  4,  4,  4,  3,  3,  4,  4,  6,  7,  6,
        3,  9,  7,  7,  4,  5,  2,  4,  3,  6,  5,  6,  3,  6,  8,  9,  6,
        3,  4,  4,  7,  3,  9, 12,  4,  5,  5,  7,  6,  5,  2, 10,  1,  3,
        4,  4,  6,  5,  4,  4,  7,  5,  6,  5,  7,  2,  5,  5])

However, I want to add two more constraints:

- Each random number should be at least 3 and at most 30.
- The generated random numbers should sum to 1000.
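
The two constraints are easy to check mechanically, which helps when comparing the approaches in the answers. A minimal sketch, where `satisfies_constraints` is a hypothetical helper name (not part of NumPy) and the defaults simply mirror the numbers in the question:

```python
import numpy as np

def satisfies_constraints(values, low=3, high=30, total=1000):
    """Return True if every value lies in [low, high] and the values sum to total.

    Hypothetical helper for illustration; the defaults mirror the question.
    """
    values = np.asarray(values)
    in_range = bool(np.all((values >= low) & (values <= high)))
    return in_range and int(values.sum()) == total

# A plain Poisson draw is not expected to pass: nothing pins its sum to 1000,
# and with a rate of 5, values below 3 are common.
sample = np.random.poisson(5, 150)
print(satisfies_constraints(sample))
```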

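I know that once I impose these constraints the result may no longer be an exact Poisson distribution. Still, I would like something Poisson-like that respects the constraints above.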

Answer 1

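Let me write something that might not work; we'll see.

A property of the Poisson distribution is that its single parameter, λ, is both the mean and the variance. Let's try another distribution, one that actually sums to 1000 and is close enough to Poisson.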

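I would try the multinomial distribution. Consider sampling 200 numbers from a multinomial. We shift each sampled number by 3, so the minimum bound is satisfied. This means the multinomial total (the n parameter) must equal 1000 - 3 * 200 = 400. Each probability p_i is set to 1/200.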

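Thus the multinomial mean is E[x_i] = n p_i = 400/200 = 2. The multinomial variance is n p_i (1 - p_i), and because p_i is very small the term (1 - p_i) is very close to 1, which makes the sampled integers look Poisson, with the mean equal to the variance. The problem is that after the shift the mean is 5 while the variance stays at ~2.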
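
To make the arithmetic concrete, here is a small sketch that evaluates those formulas for the numbers used in this answer. The variable names are only for illustration:

```python
N = 200                 # how many numbers to generate
shift = 3               # lower bound added to every draw
n = 1000 - N * shift    # multinomial trials left after the shift: 400
p_i = 1.0 / N           # equal probability for every cell

mean = n * p_i                   # E[x_i] = n * p_i
variance = n * p_i * (1 - p_i)   # Var[x_i] = n * p_i * (1 - p_i)

print(mean, variance)            # mean is 2, variance is just under 2
print(mean + shift, variance)    # the shift moves the mean to 5 but not the variance
```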

Anyway, some code.

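import numpy as np

N = 200                    # how many numbers to generate
shift = 3                  # minimum value, added back after sampling
n = 1000 - N*shift         # total left to distribute: 400
p = [1.0 / float(N)] * N   # equal probability for each of the N cells

# Raw multinomial sample: sums to n, mean 2, variance close to 2
q = np.random.multinomial(n, p, size=1)
print(np.sum(q))
print(np.mean(q))
print(np.var(q))

# Shifted sample: sums to 1000 and every value is at least 3
result = q + shift
print(np.sum(result))
print(np.mean(result))
print(np.var(result))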
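
Note that the code above guarantees the sum and the lower bound but never checks the upper bound of 30. With 200 numbers averaging 5, a value above 30 is extremely unlikely, but for other settings it can happen. One possible way to enforce it, sketched below, is to redraw until the maximum fits. `shifted_multinomial` and its parameters are hypothetical names, and the rejection loop is an addition for illustration rather than part of the approach described above:

```python
import numpy as np

def shifted_multinomial(total=1000, count=200, low=3, high=30, rng=None, max_tries=1000):
    """Draw `count` integers in [low, high] that sum to `total`.

    Hypothetical helper: a shifted equal-probability multinomial,
    redrawn until no value exceeds `high`.
    """
    if not low * count <= total <= high * count:
        raise ValueError("total cannot be reached within the bounds")
    rng = np.random.default_rng() if rng is None else rng
    n = total - low * count              # trials left after the shift
    p = np.full(count, 1.0 / count)      # equal cell probabilities
    for _ in range(max_tries):
        result = rng.multinomial(n, p) + low
        if result.max() <= high:
            return result
    raise RuntimeError("no sample within bounds; the bounds may be too tight")

values = shifted_multinomial()
print(values.sum(), values.min(), values.max())
```

Redrawing distorts the distribution slightly, and it can become slow when the total is close to `high * count`, so treat this as a starting point only.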

Answer 2

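Here's an alternative: pre-allocate the minimum to every bin, work out how many observations remain, then draw each bin's count from a Poisson whose rate is dialed according to how many observations and bins remain, accepting or rejecting each draw based on the per-bin upper limit.

Since a Poisson is a count of observations occurring within an interval, any observations that were not allocated in the initial pass are dispersed at random to bins that still have remaining capacity.

Here it is: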

import numpy as np

def make_poissonish(n, num_bins):
    if n > 30 * num_bins:
        raise ValueError("requested n exceeds 30 / bin")
    if n < 3 * num_bins:
        raise ValueError("requested n cannot fill 3 / bin")

    # Disperse minimum quantity per bin in all bins, then determine remainder
    lst = [3 for _ in range(num_bins)]
    number_remaining = n - num_bins * 3

    # Allocate counts to all bins using a truncated Poisson
    for i in range(num_bins):
        # dial the rate up or down depending on whether we're falling
        # behind or getting ahead in allocating observations to bins
        rate = number_remaining / float(num_bins - i)  # avg per remaining bin

        # keep generating until we meet the constraint requirement (acceptance/rejection)
        while True:
            x = np.random.poisson(rate)
            if x <= 27 and x <= number_remaining: break
        # Found an acceptable count, put it in this bin and move on
        lst[i] += x
        number_remaining -= x

    # If there are still observations remaining, disperse them
    # randomly across bins that have remaining capacity
    while number_remaining > 0:
        i = np.random.randint(0, num_bins)
        if lst[i] >= 30:    # not this one, it's already full!
            continue
        lst[i] += 1
        number_remaining -= 1
    return lst

Sample output:

result = make_poissonish(150, 10)
print(result)                    # => [16, 19, 11, 16, 21, 18, 12, 17, 8, 12]
print(sum(result))               # => 150

result = make_poissonish(50, 10)
print(result)                    # => [3, 5, 5, 4, 3, 3, 15, 3, 6, 3]
print(sum(result))               # => 50

Answer 3

You can do this easily with a while loop and the random module, and it gets the job done:

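from random import randint

nums_sum = 0
nums_lst = []
# Note: values are drawn uniformly from 3..30, so the result is not Poisson-shaped
while 1000 - nums_sum > 30:
    # randint includes both endpoints; cap the draw so at least 3 is left over
    n = randint(3, min(30, 1000 - nums_sum - 3))
    nums_sum += n
    nums_lst.append(n)
# The remainder is now between 3 and 30, so it becomes the last number
last = 1000 - nums_sum
nums_sum += last
nums_lst.append(last)
print(nums_sum)
print(nums_lst)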

It's that simple.

Create a Poisson-like distribution of N random numbers whose sum is a constant (C)
