Question

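I want to sample from the indices of a 2D NumPy array, with each index weighted by the number stored at that position in the array. The way I know to do this is with numpy.random.choice, but that returns the numbers themselves rather than their indices. Is there an efficient way to do this?

Here is my code: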

import numpy as np

A = np.arange(1, 10).reshape(3, 3)
A_flat = A.flatten()
# Draws values of A (not indices), weighted by the values themselves
d = np.random.choice(A_flat, size=10, p=A_flat / float(np.sum(A_flat)))
print(d)

Answer 1

Expanding on my comment: adapt the weighted-choice method shown here https://stackoverflow.com/a/10803136/553404

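import numpy as np

def weighted_choice_indices(weights):
    # Normalized cumulative sum over the flattened weights (last entry is 1)
    cs = np.cumsum(weights.flatten())/np.sum(weights)
    # Count the entries below a uniform draw; that count is the flat index
    idx = np.sum(cs < np.random.rand())
    return np.unravel_index(idx, weights.shape)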
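
For comparison, here is a different route from the function above, staying with numpy.random.choice as in the question. It is only a sketch, and the variable names (p, flat_idx, rows, cols) are illustrative: sample flat positions instead of values, then convert them back to 2D indices with np.unravel_index.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)
# Normalize the weights so they sum to 1.
p = A.ravel() / float(A.sum())
# Sample flat positions 0..A.size-1 instead of the values themselves.
flat_idx = np.random.choice(A.size, size=10, p=p)
# Convert flat positions to (row, col) index arrays.
rows, cols = np.unravel_index(flat_idx, A.shape)
```

A[rows, cols] then gives the values that the question's original code would have returned.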

Answer 2

You can do the following:

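import numpy as np

def wc(weights):
    # cumsum without an axis argument works on the flattened array
    cs = np.cumsum(weights)
    # Scale a uniform draw by the total weight and binary-search for its position
    idx = cs.searchsorted(np.random.random() * cs[-1], 'right')
    return np.unravel_index(idx, weights.shape)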

Note that cumsum is the slowest part, so if you need to do this repeatedly for the same array, I suggest computing the cumsum ahead of time and reusing it.
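
A minimal sketch of that reuse, assuming a hypothetical helper named make_sampler (not part of the answer above) that computes the cumulative sum once and returns a function drawing n index pairs per call:

```python
import numpy as np

def make_sampler(weights):
    # Hypothetical helper: compute the cumulative sum once and reuse it.
    weights = np.asarray(weights)
    cs = np.cumsum(weights)  # flattens the array in row-major order
    shape = weights.shape

    def sample(n):
        # Draw n uniform values in [0, total) and locate each in the cumulative sum.
        r = np.random.random(n) * cs[-1]
        idx = cs.searchsorted(r, 'right')
        # Convert flat positions back to (row, col) index arrays.
        return np.unravel_index(idx, shape)

    return sample

A = np.arange(1, 10).reshape(3, 3)
sample = make_sampler(A)
rows, cols = sample(10)
```

Because searchsorted accepts an array of values, a single call covers all n draws, so the cumulative sum is only ever computed once per weight array.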

Efficient way to sample from the indices of a NumPy array?
