Creating an array of binomial values subject to a condition (most time-efficient way)

Time: 2018-12-21 02:29:00

Tags: python arrays numpy time

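I'm currently working on a project that involves creating an array of 10 binomial values (0 or 1), each drawn with a given success rate (= ci_rate[i] / 1'000).

Because the rate is different in each of the 10 years, I run a loop 10 times and create 20'000 binomial values each time (one per scenario, for 20,000 scenarios).

The success rate is small, but once a success occurs it is an absorbing state for all following years. Simplified to just 10 scenarios (columns) and 10 years (rows), I would like output like this: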

[1,0,0,0,0,0,0,0,0,0]
[1,0,0,0,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]

At the moment I'm solving the problem this way:

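import numpy as np

# ci_rate (one rate per year, per 1,000) is assumed to be defined earlier.
ci_sim = []  # one array of yearly 0/1 states per scenario
for j in range(20000):
    tem = np.zeros(len(ci_rate))
    for i in range(len(ci_rate)):
        if i == 0:
            tem[0] = np.random.binomial(1, p=ci_rate[i] / 1000)
        else:
            # Stay at 1 once reached (absorbing state); otherwise draw again.
            tem[i] = int(np.where(tem[i-1] == 1, 1, np.random.binomial(1, p=ci_rate[i] / 1000)))

    ci_sim.append(tem)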

Does anyone have a more creative (and faster) way to solve this?

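3 answers:

Answer 0: (score: 3)

This solution first draws every year independently, ignoring the persistence rule, and then enforces the rule with np.maximum.accumulate along the year axis.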

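import numpy as np
ci_rate = np.random.uniform(0, 0.1, 10)  # example rates, already probabilities
# Independent draws per (scenario, year), then a running maximum along the years
res = np.maximum.accumulate(np.random.random((20000, ci_rate.size)) < ci_rate, axis=1).view(np.int8)
res[:20]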
# 
# array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
#        [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
#        [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
#        [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
#        [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=int8)
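To see why the running maximum enforces the absorbing state, it may help to apply it to a small hand-written array instead of random draws. The following is only a minimal sketch; the array `draws` and its values are made up for illustration and do not come from the question.

```python
import numpy as np

# Hypothetical independent draws: 3 scenarios (rows) x 5 years (columns).
draws = np.array([
    [False, True,  False, False, True ],
    [False, False, False, False, False],
    [True,  False, False, False, False],
])

# A running maximum along the year axis keeps a row at True
# from the first True onward.
absorbed = np.maximum.accumulate(draws, axis=1).astype(np.int8)
print(absorbed)
# [[0 1 1 1 1]
#  [0 0 0 0 0]
#  [1 1 1 1 1]]
```

Here `.astype(np.int8)` makes a converted copy, whereas the `.view(np.int8)` in the answer reinterprets the one-byte boolean buffer in place without copying.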

Answer 1: (score: 2)

My attempt:

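import numpy as np
# 20 example rates: normalize to [0, 1], shift by 0.7, clip negatives to 0
ci_rate = np.random.normal(size=20)
ci_rate = (ci_rate - min(ci_rate)) / (max(ci_rate) - min(ci_rate)) - 0.7
ci_rate[ci_rate < 0] = 0
r = []
for i in range(100):
    t = np.random.binomial(1, ci_rate)
    r += [t.tolist()]
    # After a success, set the rate to 1 so that entry stays 1 from then on
    ci_rate = [1 if hit == 1 else rate for rate, hit in zip(ci_rate, t)]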


#output 

[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
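The same year-by-year idea can also be written with a rate that varies per year, as in the question, while updating all scenarios at once. This is a minimal sketch, assuming `ci_rate` holds one per-1,000 rate for each year; the function name `simulate_by_year`, the example rates and the seed are made up for illustration.

```python
import numpy as np

def simulate_by_year(ci_rate, n_scenarios, rng):
    """Return an (n_years, n_scenarios) int8 array in which 1 is absorbing."""
    p = np.asarray(ci_rate, dtype=float) / 1000.0  # per-1,000 rates -> probabilities
    out = np.zeros((p.size, n_scenarios), dtype=np.int8)
    state = np.zeros(n_scenarios, dtype=bool)      # True once a scenario has succeeded
    for year, p_year in enumerate(p):
        # New success this year, or already absorbed in an earlier year.
        state |= rng.random(n_scenarios) < p_year
        out[year] = state
    return out

rng = np.random.default_rng(0)                     # seed chosen arbitrarily
sim = simulate_by_year([5, 10, 20, 40], n_scenarios=20000, rng=rng)
print(sim.mean(axis=1))  # share of absorbed scenarios per year, non-decreasing
```

The Python loop runs once per year rather than once per scenario, so its cost should stay small when there are few years and many scenarios. Rows are years and columns are scenarios, matching the layout of the desired output in the question.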

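Answer 2: (score: 0)

I suggest using the geometric distribution, since you seem to be looking at the number of trials until the first success.

Below I compare the geometric-distribution approach with the accepted one in terms of computation time.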

EDIT:
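%%timeit
# nb_years and nb_scenarios are assumed to be defined in an earlier cell
ci_rate = np.random.uniform(0, 0.1, nb_years)
# One draw per entry of ci_rate: the 1-based trial number of the first success
successful_trail = np.random.geometric(p=ci_rate)

ci_sim=np.zeros((nb_scenarios,nb_years))
for i in range(nb_years):
    ci_sim[i,successful_trail[i]:]=1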

## 10000 loops, best of 3: 41.4 µs per loop

%%timeit
ci_rate = np.random.uniform(0, 0.1, nb_years)
res = np.maximum.accumulate(np.random.random((nb_scenarios, ci_rate.size))<ci_rate, axis=1).view(np.int8)

## 100 loops, best of 3: 2.97 ms per loop
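Two details of the geometric snippet above are worth keeping in mind when reading the timings. `np.random.geometric(p=ci_rate)` returns one draw per entry of `ci_rate`, so the loop fills only `nb_years` rows rather than one row per scenario, and the draws are 1-based trial numbers. The geometric distribution also presumes the same success probability in every year. The following is a minimal sketch of a per-scenario variant under that assumption; the constant rate `p`, the sizes and the seed are made up for illustration, and this is not the answer author's code.

```python
import numpy as np

nb_scenarios, nb_years = 20000, 10   # sizes assumed for illustration
p = 0.05                             # one constant yearly rate (assumption)

rng = np.random.default_rng(0)       # seed chosen arbitrarily
# 1-based year of the first success, one draw per scenario.
first_success = rng.geometric(p, size=nb_scenarios)

# Year number y (1-based) is absorbed when y >= first_success.
years = np.arange(1, nb_years + 1)
ci_sim = (years[None, :] >= first_success[:, None]).astype(np.int8)
print(ci_sim[:5])
```

With year-specific rates, as in the question, the waiting time until the first success is no longer geometric, so the running-maximum approach is probably the more direct fit there.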