Question

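How can I generate a random number with a specified bias toward one number? For example, how would I pick between the two numbers 1 and 2 with a 90% bias toward 1? The best I can come up with is...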

import random

print(random.choice([1, 1, 1, 1, 1, 1, 1, 1, 1, 2]))

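Is there a better way? The method I showed works in simple cases, but eventually I will have to make more complicated selections with very specific biases (such as 37.65%), which would require a very long list.

EDIT: I should add that I'm stuck on numpy 1.6, so I can't use numpy.random.choice.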

Answer 1

np.random.choice has a p parameter that you can use to specify the probability of each choice:

import numpy as np

np.random.choice([1, 2], p=[0.9, 0.1])
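
For the 37.65% case from the question, the same call can be checked empirically by drawing many samples at once. This is only a sketch: it assumes a numpy version that provides choice (so not the 1.6 release the asker is stuck on), and the helper name biased_draws and the seed are made up for illustration.

```python
import numpy as np

def biased_draws(n, p_first=0.3765, seed=0):
    """Draw n values from [1, 2], picking 1 with probability p_first."""
    # Seeded only so the sketch is repeatable; the seed value is arbitrary.
    rng = np.random.RandomState(seed)
    return rng.choice([1, 2], size=n, p=[p_first, 1.0 - p_first])

draws = biased_draws(100000)
# Fraction of draws equal to 1; it should land close to 0.3765.
print((draws == 1).mean())
```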

Answer 2

If you only need to draw one item at a time, the algorithm that np.random.choice() uses is relatively simple to replicate.

import numpy as np

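def simple_weighted_choice(choices, weights, prng=np.random):
    # Running totals of the weights; the last entry is the total weight.
    running_sum = np.cumsum(weights)
    # Uniform draw over [0, total weight).
    u = prng.uniform(0.0, running_sum[-1])
    # Index of the first running total that is >= u.
    i = np.searchsorted(running_sum, u, side='left')
    return choices[i]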
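
To see how this behaves for the 37.65% bias mentioned in the question, one could tally a large number of single draws. The sketch below repeats the function so that it runs on its own; the seed and sample count are arbitrary assumptions. Note that the weights are not required to sum to 1, because the uniform draw is scaled by the total weight.

```python
import numpy as np

def simple_weighted_choice(choices, weights, prng=np.random):
    # Same function as in the answer, repeated so this sketch is self-contained.
    running_sum = np.cumsum(weights)
    u = prng.uniform(0.0, running_sum[-1])
    i = np.searchsorted(running_sum, u, side='left')
    return choices[i]

prng = np.random.RandomState(12345)  # arbitrary seed for repeatability
n = 20000
# Weights given as percentages; they sum to 100 rather than 1.
hits = sum(simple_weighted_choice([1, 2], [37.65, 62.35], prng) == 1
           for _ in range(n))
freq = hits / float(n)
print(freq)  # should be near 0.3765
```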

Answer 3

For random sampling with replacement, the essential code in np.random.choice is

            cdf = p.cumsum()
            cdf /= cdf[-1]
            uniform_samples = self.random_sample(shape)
            idx = cdf.searchsorted(uniform_samples, side='right')

So we can do the same thing in a new function of our own (but without the error checking and other niceties):

import numpy as np


def weighted_choice(values, p, size=1):
    values = np.asarray(values)

    cdf = np.asarray(p).cumsum()
    cdf /= cdf[-1]

    uniform_samples = np.random.random_sample(size)
    idx = cdf.searchsorted(uniform_samples, side='right')
    sample = values[idx]

    return sample

Examples:

In [113]: weighted_choice([1, 2], [0.9, 0.1], 20)
Out[113]: array([1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1])

In [114]: weighted_choice(['cat', 'dog', 'goldfish'], [0.3, 0.6, 0.1], 15)
Out[114]: 
array(['cat', 'dog', 'cat', 'dog', 'dog', 'dog', 'dog', 'dog', 'dog',
       'dog', 'dog', 'dog', 'goldfish', 'dog', 'dog'], 
      dtype='|S8')

Answer 4

Something like this should do the trick. It works with any floating-point probabilities and does not need a long list of repeated values.

import random

try:
    from itertools import accumulate  # available in Python 3.x
except ImportError:
    def accumulate(l):  # fallback for Python 2.x
        tmp = 0
        for n in l:
            tmp += n
            yield tmp

def random_choice(a, p):
    sums = float(sum(p))  # float avoids integer division on Python 2.x
    accum = accumulate(p)  # running totals of the probabilities
    accum = [n / sums for n in accum]  # normalize so the last entry is 1.0
    rnd = random.random()
    for i, item in enumerate(accum):
        if rnd < item:
            return a[i]
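
If many draws are needed from the same weights, the running totals could be built once and searched with bisect from the standard library instead of being rebuilt on every call. This is a sketch of that variation, not part of the answer above; make_chooser is a hypothetical helper name and the seed is arbitrary.

```python
import bisect
import random

def make_chooser(items, weights, rng=random):
    """Return a function that draws one item per call (hypothetical helper)."""
    cumulative = []
    total = 0.0
    for w in weights:
        total += w
        cumulative.append(total)  # running totals, built once

    def choose():
        u = rng.random() * total  # uniform in [0, total)
        i = bisect.bisect_right(cumulative, u)
        # Clamp the index in case float rounding pushes u past the last total.
        return items[min(i, len(items) - 1)]

    return choose

choose = make_chooser([1, 2], [0.3765, 0.6235], random.Random(42))
counts = {1: 0, 2: 0}
for _ in range(20000):
    counts[choose()] += 1
print(counts)  # the count for 1 should be roughly 37.65% of 20000
```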

Answer 5

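What is easy to obtain is an index into a probability table. Build a table with as many cumulative thresholds as you need, for example: prb = [0.5, 0.65, 0.8, 1]

Get the index with something like this: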

def get_in_range(prb, pointer):
    """Returns index of matching range in table prb"""
    found = 0
    for p in prb:
        if pointer > p:  # count the thresholds that lie below pointer
            found += 1
    return found

The index returned by get_in_range can then be used to look up the corresponding entry in a table of values.

Example usage:

import random
values = [1, 2, 3]
prb = [0.9, 0.99, 1]  # cumulative thresholds, one per value
result = values[get_in_range(prb, random.random())]

With the thresholds 0.9, 0.99 and 1, this should choose 1 with 90% probability, 2 with 9%, and 3 with 1%.
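
The table in this answer holds cumulative thresholds rather than per-item probabilities. A small sketch of building such a table from per-item probabilities follows; cumulative_table is a hypothetical helper, the letter values are placeholders, and get_in_range is rewritten compactly here so that the block runs on its own.

```python
import random

def cumulative_table(probabilities):
    """Turn per-item probabilities into running totals (hypothetical helper)."""
    table, total = [], 0.0
    for p in probabilities:
        total += p
        table.append(total)
    return table

def get_in_range(prb, pointer):
    """Count the thresholds below pointer; that count is the index."""
    return sum(1 for p in prb if pointer > p)

# Per-item probabilities matching the example table [0.5, 0.65, 0.8, 1],
# up to floating-point rounding.
prb = cumulative_table([0.5, 0.15, 0.15, 0.2])
values = ['a', 'b', 'c', 'd']  # placeholder values for illustration
rng = random.Random(7)         # arbitrary seed for repeatability
index = min(get_in_range(prb, rng.random()), len(values) - 1)  # guard rounding
pick = values[index]
print(prb, pick)
```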

How do I "randomly" select numbers with a specified bias toward a particular number
