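I am working on a project that involves creating arrays of binomial values (0 and 1) with a given success probability (= ci_rate[i] / 1'000).
Because the rate differs in each of the 10 years, I run the loop 10 times, each time creating 20'000 binomial values (one for each of 20'000 scenarios).
The success probability is small, but a success is an absorbing state: once a scenario reaches 1, it stays at 1 in all following years. Simplified to only 10 scenarios and 10 years (one row per year, one column per scenario), I would like output like the following: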
[1,0,0,0,0,0,0,0,0,0]
[1,0,0,0,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,0,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
[1,0,0,1,0,1,0,1,0,0]
Currently, I am solving the problem this way:
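import numpy as np

# ci_rate holds the per-mille rates, one per year (defined elsewhere)
ci_sim = []
for j in range(20000):
    tem = np.zeros(len(ci_rate))
    for i in range(len(ci_rate)):
        if i == 0:
            tem[0] = np.random.binomial(1, p=ci_rate[i] / 1000)
        else:
            # keep a 1 from the previous year; otherwise draw again
            tem[i] = int(np.where(tem[i-1] == 1, 1, np.random.binomial(1, p=ci_rate[i] / 1000)))
    ci_sim.append(tem)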
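Is there a more creative way to solve this problem?
Answer 0 (score: 3)
This solution first ignores the persistence rule and then enforces it with maximum.accumulate, which takes a running maximum along the year axis.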
ci_rate = np.random.uniform(0, 0.1, 10)
res = np.maximum.accumulate(np.random.random((20000, ci_rate.size))<ci_rate, axis=1).view(np.int8)
res[:20]
#
# array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
# [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=int8)
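The two steps of this answer can be unpacked as follows. This is only a minimal sketch, assuming per-mille rates as in the question: the function name `simulate_absorbing`, the example rates, and the seed are hypothetical values chosen for illustration, and it uses `np.random.default_rng` instead of the legacy `np.random.random` call shown above.

```python
import numpy as np

def simulate_absorbing(ci_rate, n_scenarios, seed=None):
    """Simulate 0/1 paths in which a 1 persists for all later years."""
    rng = np.random.default_rng(seed)
    # Convert per-mille rates to probabilities (assumption taken from the question)
    p = np.asarray(ci_rate, dtype=float) / 1000.0
    # Step 1: independent Bernoulli draw for every (scenario, year) pair;
    # p broadcasts across the rows, so each column uses its own yearly rate
    hits = rng.random((n_scenarios, p.size)) < p
    # Step 2: running maximum along the year axis, so a 1 never drops back to 0
    absorbed = np.maximum.accumulate(hits, axis=1)
    return absorbed.astype(np.int8)

# Hypothetical per-mille rates for 10 years
ci_rate = [5, 8, 12, 20, 15, 10, 30, 25, 18, 40]
ci_sim = simulate_absorbing(ci_rate, n_scenarios=20000, seed=0)
print(ci_sim.shape)
```

Here each row is a scenario and each column a year; `ci_sim.T` gives the year-by-scenario layout used in the question.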
Answer 1 (score: 2)
My attempt:
import numpy as np
ci_rate = np.random.normal(size=20)
ci_rate = (ci_rate - min(ci_rate)) /(max(ci_rate) - min(ci_rate)) - 0.7
ci_rate[ci_rate < 0] = 0
r = []
for i in range(100):
    t = np.random.binomial(1, ci_rate)
    r += [t.tolist()]
    # after a success, set that probability to 1 so the value stays at 1
    ci_rate = [1 if j == 1 else i for i, j in zip(ci_rate, t)]
#output
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
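The same step-by-step idea can also be written with a boolean state vector and a different rate per year, as in the question. The following is a rough sketch rather than the answerer's code: the names `yearly_rate`, `state`, and `ci_sim_by_year`, the example rates, and the seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical yearly probabilities (per-mille rates divided by 1000)
yearly_rate = np.array([5, 8, 12, 20, 15, 10, 30, 25, 18, 40]) / 1000.0
n_scenarios = 20000

state = np.zeros(n_scenarios, dtype=bool)  # no scenario has had a success yet
rows = []
for p in yearly_rate:
    # Scenarios already at 1 stay at 1; the others get a fresh Bernoulli(p) draw
    state = state | (rng.random(n_scenarios) < p)
    rows.append(state.astype(np.int8))

ci_sim_by_year = np.stack(rows)  # shape: (years, scenarios)
print(ci_sim_by_year.shape)
```

The Python loop runs only once per year, while all scenarios are updated in a single vectorised operation per iteration.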
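Answer 2 (score: 0)
I suggest using the geometric distribution, since you seem to be looking for the number of trials until the first success.
I am comparing the usefulness of the geometric distribution in terms of computation time.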
EDIT:
%%timeit
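# nb_years and nb_scenarios are assumed to be defined in an earlier cell
ci_rate = np.random.uniform(0, 0.1, nb_years)
successful_trail = np.random.geometric(p=ci_rate)
ci_sim = np.zeros((nb_scenarios, nb_years))
# As written, this draws one geometric sample per yearly rate and
# fills only the first nb_years rows of ci_sim
for i in range(nb_years):
    ci_sim[i, successful_trail[i]:] = 1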
## 10000 loops, best of 3: 41.4 µs per loop
%%timeit
ci_rate = np.random.uniform(0, 0.1, nb_years)
res = np.maximum.accumulate(np.random.random((nb_scenarios, ci_rate.size))<ci_rate, axis=1).view(np.int8)
## 100 loops, best of 3: 2.97 ms per loop
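The geometric distribution assumes the same success probability in every trial, whereas the question uses a different rate each year. One way to keep the "sample the time of first success" idea with year-dependent rates is inverse-CDF sampling with one uniform draw per scenario. This is a hedged sketch and not part of the answer or its timings: `first_success_paths`, the example probabilities, and the seed are hypothetical.

```python
import numpy as np

def first_success_paths(p, n_scenarios, seed=None):
    """Sample the year of first success per scenario, then fill 1s from that year on."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    # cdf[k] = probability that the first success happens in year k or earlier
    cdf = 1.0 - np.cumprod(1.0 - p)
    u = rng.random(n_scenarios)
    # Smallest k with cdf[k] > u; equals p.size when no success occurs at all
    first = np.searchsorted(cdf, u, side="right")
    # Entry (s, k) is 1 exactly when year k is at or after scenario s's first success
    return (np.arange(p.size) >= first[:, None]).astype(np.int8)

# Hypothetical yearly success probabilities
p = np.array([5, 8, 12, 20, 15, 10, 30, 25, 18, 40]) / 1000.0
paths = first_success_paths(p, n_scenarios=20000, seed=0)
print(paths.shape)
```

This needs one random number per scenario instead of one per scenario and year; whether that is faster in practice would have to be measured.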