Speeding up the calculation of a beta PERT distribution in Python

Question

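I am computing a beta PERT distribution on every iteration of a for loop (among other things, but computing the distribution takes the most time). I originally coded this in R, which took too long, so I am trying a faster tool.

Some of my datasets can be very large. For example, I just ran 153413 cases and it took about 8 hours in Python (better than R, but still rather long).

I am new to Python and would like to know whether there is a way to speed up a calculation like this.

Example code: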

import time

import numpy as np
import scipy.stats as st

# Beta PERT shape parameters as functions of mean, minimum, mode and maximum
af = lambda pmu, pmin, pmode, pmax: (pmu-pmin)*(2*pmode-pmin-pmax)/((pmode-pmu)*(pmax-pmin))
bf = lambda pmu, pmin, pmode, pmax: (pmax-pmu)/(pmu-pmin)*((pmu-pmin)*(2*pmode-pmin-pmax)/((pmode-pmu)*(pmax-pmin)))

e = 5.
shape = 4.
max = 10.
mu_d = np.arange(0, 10, 0.05)
d = np.arange(0.025, 60.025, 0.05)
nlocs = 153413  # number of rows in dataset

f0_dist = np.zeros(len(mu_d))
f1_dist = np.zeros(len(mu_d))
f2_dist = np.zeros(len(mu_d))
aggr_dist = np.zeros(len(mu_d))  # assumed initial value; not shown in the original post

f0 = st.norm.cdf(d, 0.9/2., 0.9/6.)
f1 = st.uniform.cdf(d, 0.001, 0.9)

tic = time.clock()
for i in xrange(nlocs):
    for j in xrange(len(mu_d)):  # mu_d has 121 values
        Rp_min = mu_d[j] - 1.96*e
        Rp_mode = mu_d[j] - 0.75*e
        Rp_max = max
        Rp_mu = (Rp_min+Rp_max+shape*Rp_mode)/(shape+2)
        dist = st.beta.cdf(d, a=af(Rp_mu, Rp_min, Rp_mode, Rp_max), b=bf(Rp_mu, Rp_min, Rp_mode, Rp_max), loc=Rp_min, scale=1-Rp_min)

        f0_dist[j] = 1 - np.sum(dist*f0*0.05)
        f1_dist[j] = 1 - np.sum(dist*f1*0.05)
        f2_dist[j] = 1 - np.sum(dist*0.05)
    # f2_dist is computed above but not used here; 0.1*f2_dist may have been intended
    temp = 0.4*f0_dist + 0.5*f1_dist + 0.1*f1_dist
    aggr_dist = aggr_dist + temp

toc = time.clock() - tic
print '\nTime elapsed: %.3f seconds\n' % toc

Answer 1

Here is a modified version of the code:

import numpy as np
import scipy.stats as st

af = lambda pmu, pmin, pmode, pmax: (pmu-pmin)*(2*pmode-pmin-pmax)/((pmode-pmu)*(pmax-pmin))
bf = lambda pmu, pmin, pmode, pmax: (pmax-pmu)/(pmu-pmin)*((pmu-pmin)*(2*pmode-pmin-pmax)/((pmode-pmu)*(pmax-pmin)))

e = 5.
shape = 4.
max = 10.
mu_d = np.arange(0, 10, 0.05)
d = np.arange(0.025, 60.025, 0.05)
nlocs = 153413  # number of rows, taken from the question

# Loop-invariant values, computed once before the loops
Rp_max = max
e1_96 = 1.96 * e
e0_75 = 0.75 * e
for i in xrange(nlocs):  # e.g. 153413
    for mu_d_j in mu_d:  # mu_d has 121 values
        Rp_min = mu_d_j - e1_96
        Rp_mode = mu_d_j - e0_75
        Rp_mu = (Rp_min+Rp_max+shape*Rp_mode)/(shape+2)

    # as written, this runs once per i iteration (see the disclaimer)
    dist = st.beta.cdf(d, a=af(Rp_mu, Rp_min, Rp_mode, Rp_max), b=bf(Rp_mu, Rp_min, Rp_mode, Rp_max), loc=Rp_min, scale=1-Rp_min)

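The explanation follows.

Spare every instruction you can inside loops

Move `Rp_max = max` out of the loop.
Precalculate constants outside the loop (`e1_96` and `e0_75`).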
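As a minimal sketch of this hoisting (written for Python 3; the function names `pert_means_naive` and `pert_means_hoisted` are made up for illustration, and only the cheap arithmetic from the loop is reproduced, not the `st.beta.cdf` call), both versions below return the same values, but the second evaluates the constant products only once:

```python
def pert_means_naive(mu_values, e, shape, rp_max):
    """Recompute the constant products on every iteration."""
    means = []
    for mu in mu_values:
        rp_min = mu - 1.96 * e       # 1.96 * e is recomputed each time
        rp_mode = mu - 0.75 * e      # so is 0.75 * e
        means.append((rp_min + rp_max + shape * rp_mode) / (shape + 2))
    return means


def pert_means_hoisted(mu_values, e, shape, rp_max):
    """Same result, with loop-invariant values computed once up front."""
    e1_96 = 1.96 * e
    e0_75 = 0.75 * e
    denom = shape + 2
    means = []
    for mu in mu_values:
        rp_min = mu - e1_96
        rp_mode = mu - e0_75
        means.append((rp_min + rp_max + shape * rp_mode) / denom)
    return means


mu_values = [0.05 * k for k in range(200)]
naive = pert_means_naive(mu_values, 5.0, 4.0, 10.0)
hoisted = pert_means_hoisted(mu_values, 5.0, 4.0, 10.0)
```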

Avoid deeper references

Look up `mu_d[j]` only once and keep it in a local variable; reaching for a more deeply nested value costs time.
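The same idea applies to attribute chains such as `st.beta.cdf`, which Python resolves again on every call. A small sketch in Python 3, using `math.sqrt` as a stand-in so that it runs without SciPy (the function names are illustrative and not part of the original answer):

```python
import math


def norms_with_lookup(values):
    # math.sqrt is looked up on the module for every element
    return [math.sqrt(v * v + 1.0) for v in values]


def norms_with_local(values):
    sqrt = math.sqrt  # resolve the attribute once, then use the local name
    return [sqrt(v * v + 1.0) for v in values]


values = [0.05 * k for k in range(200)]
same = norms_with_lookup(values) == norms_with_local(values)
```

The gain from this is likely to be small next to the cost of a call like `st.beta.cdf` itself, so treat it as a minor tweak rather than a fix.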

Use the `for` loop to get the values instead of `lst[i]`

The following code:

for j in xrange(len(mu_d)): # mu_d has 121 values
    mu_d_j = mu_d[j]

should become more efficient (and more Pythonic) as:

for mu_d_j in mu_d: # mu_d has 121 values
    #now use mu_d_j
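When the index is still needed, as it is in the question's code to store results in `f0_dist[j]`, `enumerate` yields both the index and the value. A minimal Python 3 sketch with a placeholder computation (`results` and the doubling are illustrative only):

```python
mu_d = [0.05 * k for k in range(200)]
results = [0.0] * len(mu_d)

# enumerate gives the position j and the value mu_d_j in one step,
# so mu_d[j] never has to be looked up inside the loop body
for j, mu_d_j in enumerate(mu_d):
    results[j] = 2.0 * mu_d_j  # placeholder for the real per-value work
```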

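Measure time

This is the basic rule: every modification should be measured. If you set a target speed (processing time) in advance, you will soon know when the code is fast enough and you can stop optimizing.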
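One way to do this on a small scale is the standard-library `timeit` module. The sketch below (Python 3; `work_indexed` and `work_direct` are hypothetical stand-ins for the code before and after a change) times both variants on the same input:

```python
import timeit

data = [0.05 * k for k in range(200)]


def work_indexed():
    total = 0.0
    for j in range(len(data)):
        total += data[j] * 1.96
    return total


def work_direct():
    total = 0.0
    for value in data:
        total += value * 1.96
    return total


# Take the best of several repeats to reduce noise from other processes.
t_indexed = min(timeit.repeat(work_indexed, number=1000, repeat=5))
t_direct = min(timeit.repeat(work_direct, number=1000, repeat=5))

print("indexed: %.4f s, direct: %.4f s" % (t_indexed, t_direct))
```

Timing a reduced number of rows first, rather than all 153413, keeps each experiment short.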

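Disclaimer

As I could not run the code, I cannot guarantee that all the changes are correct. There are a few lines where I am not sure what they are supposed to do:

The last line, `dist =`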

   dist = st.beta.cdf(d, a=af(Rp_mu, Rp_min, Rp_mode, Rp_max), b=bf(Rp_mu, Rp_min, Rp_mode, Rp_max), loc=Rp_min, scale=1-Rp_min)

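Is it indented correctly? As it stands, it is executed once per `nlocs` iteration.

Where is the resulting `dist` value used?

If it belongs to the innermost loop, a few more optimizations are possible (using fewer variable names and moving some code inline).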
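One such inlining follows directly from the two lambdas in the question: `bf` is `af` multiplied by `(pmax-pmu)/(pmu-pmin)`, so the shared expression needs to be evaluated only once. A sketch in Python 3, where the helper name `pert_shape_params` is an assumption made for illustration:

```python
def pert_shape_params(pmu, pmin, pmode, pmax):
    """Return (a, b) as defined by the af and bf lambdas, computing a once."""
    a = (pmu - pmin) * (2 * pmode - pmin - pmax) / ((pmode - pmu) * (pmax - pmin))
    b = (pmax - pmu) / (pmu - pmin) * a  # bf is af scaled by this ratio
    return a, b


# Same formulas as in the question, kept here for comparison.
af = lambda pmu, pmin, pmode, pmax: (pmu-pmin)*(2*pmode-pmin-pmax)/((pmode-pmu)*(pmax-pmin))
bf = lambda pmu, pmin, pmode, pmax: (pmax-pmu)/(pmu-pmin)*((pmu-pmin)*(2*pmode-pmin-pmax)/((pmode-pmu)*(pmax-pmin)))

# Parameters for mu_d = 0 with e = 5, shape = 4, max = 10.
rp_min, rp_mode, rp_max = 0 - 1.96 * 5.0, 0 - 0.75 * 5.0, 10.0
rp_mu = (rp_min + rp_max + 4.0 * rp_mode) / (4.0 + 2)
a, b = pert_shape_params(rp_mu, rp_min, rp_mode, rp_max)
```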
