Question

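I want to simulate a variable that takes values between 0 and 1. However, I also want this random variable to be 80% zeros. At the moment I am doing the following: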

data['response']=np.random.uniform(0,1,15000)#simulate response
data['response']=data['response'].apply(lambda x:0 if x<0.85 else x)

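But this only produces extreme values in the variable (0 and .8+). I would like 80% of the rows to be zero and the remaining 20% to have values between 0 and 1. This has to be done randomly.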

Answer 1

Here is another approach that uses numpy.random.shuffle.
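A minimal sketch of how a shuffle-based approach might look, assuming 15000 rows and an 80% share of zeros as in the question. The names n, zeros_ratio and response are illustrative, not taken from the answer.

```python
import numpy as np

n = 15000          # number of rows, as in the question
zeros_ratio = 0.8  # desired share of zeros

# Number of entries that should be nonzero (3000 for these settings)
n_nonzero = int(round(n * (1 - zeros_ratio)))

# Start with all zeros, fill the first n_nonzero slots with uniform draws
response = np.zeros(n)
response[:n_nonzero] = np.random.uniform(0, 1, n_nonzero)

# Shuffle in place so the nonzero values land at random positions
np.random.shuffle(response)
```

Because the nonzero count is fixed before shuffling, the share of zeros is exact rather than approximate.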


Answer 2

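Here is an approach using np.random.choice, which suits this case: with its optional argument replace set to False (or 0), it generates unique indices along the entire length of 15000, and np.random.uniform then generates the random numbers that are assigned at those positions.

An implementation would be along these lines -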

import numpy as np

# Parameters
s = 15000 # Length of array
zeros_ratio = 0.8 # Ratio of zeros expected in the array

out = np.zeros(s) # Initialize output array
nonzeros_count = int(np.rint(s*(1-zeros_ratio))) # Count of nonzeros in array

# Generate unique indices where nonzeros are to be placed
idx = np.random.choice(s, nonzeros_count, replace=False)

# Generate nonzeros between 0 and 1
nonzeros_num = np.random.uniform(0,1,nonzeros_count)

# Finally assign into those unique positions
out[idx] = nonzeros_num
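The same steps could be wrapped in a reusable function and checked against the requested ratio. The sketch below is one possible variant, not the answer's own code: the helper name sparse_uniform and its seed parameter are hypothetical, and it uses NumPy's Generator API (np.random.default_rng) instead of the legacy np.random functions so that runs can be reproduced.

```python
import numpy as np

def sparse_uniform(size, zeros_ratio, seed=None):
    """Hypothetical helper: `size` values in [0, 1) with a fixed share of zeros."""
    rng = np.random.default_rng(seed)
    nonzeros_count = int(np.rint(size * (1 - zeros_ratio)))
    out = np.zeros(size)
    # Unique positions for the nonzero entries
    idx = rng.choice(size, nonzeros_count, replace=False)
    out[idx] = rng.uniform(0, 1, nonzeros_count)
    return out

out = sparse_uniform(15000, 0.8, seed=0)
zero_share = np.mean(out == 0)  # fraction of entries that are exactly zero
```

With these settings 3000 of the 15000 positions receive a uniform draw, so zero_share should come out at 0.8.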

Answer 3

Building on your code, you can rescale x when it is greater than 0.8:

lambda x: 0 if x < 0.8 else 5 * (x - 0.8)
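The rescaling works because x - 0.8 lies in [0, 0.2) for the values that are kept, and multiplying by 5 stretches that interval onto [0, 1). A vectorized sketch of the same idea, assuming a plain NumPy array rather than the question's DataFrame column (the names x and response are illustrative):

```python
import numpy as np

x = np.random.uniform(0, 1, 15000)

# Values below 0.8 become 0; values in [0.8, 1) are stretched onto [0, 1)
response = np.where(x < 0.8, 0.0, 5 * (x - 0.8))
```

Note that this yields roughly 80% zeros on average, not an exact count, since each draw falls below 0.8 independently.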

Answer 4

We can draw numbers from a uniform distribution that extends into the negative range, and then take the maximum with zero:

>>> numpy.maximum(0, numpy.random.uniform(-4, 1, 15000))
array([ 0.57310319,  0.        ,  0.02696571, ...,  0.        ,
        0.        ,  0.        ])
>>> a = _
>>> sum(a <= 0)
12095
>>> sum(a > 0)
2905
>>> 12095 / 15000
0.8063333333333333

Here -4 is used because 4 / (4 + 1) = 80%.
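The same reasoning generalizes to other ratios: if the negative part of the interval has width low, the expected share of zeros is low / (low + 1), so low = p / (1 - p) for a target share p. A small sketch under that assumption (the names zeros_ratio and low are illustrative):

```python
import numpy as np

zeros_ratio = 0.8

# Solve low / (low + 1) = zeros_ratio for the width of the negative part
low = zeros_ratio / (1 - zeros_ratio)  # approximately 4 for zeros_ratio = 0.8

# Draw from [-low, 1) and clip everything negative to zero
a = np.maximum(0, np.random.uniform(-low, 1, 15000))
```

As in the run above, the share of zeros is only approximately the target, because it depends on how many draws happen to fall below zero.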

Since the result is a sparse array, a SciPy sparse matrix may be more suitable:

>>> a = scipy.sparse.rand(1, 15000, 0.2)
>>> a.toarray()
array([[ 0.        ,  0.03971366,  0.        , ...,  0.        ,
         0.        ,  0.9252341 ]])

Here 0.2 = 1 - 0.8 is the density of the array. The nonzero numbers are uniformly distributed between 0 and 1.

Randomly generating a higher proportion of zeros in Python
