Question

I was able to vectorize this part and get rid of the nested loop.

import numpy as np
from scipy import stats

def EMalgofast(obsdata, beta, pjt):
     n = np.shape(obsdata)[0]
     g = np.shape(pjt)[0]
     zijtpo = np.zeros(shape=(n,g))
     for j in range(g):
         zijtpo[:,j] = pjt[j]*stats.expon.pdf(obsdata,scale=beta[j])

     zijdenom = np.sum(zijtpo, axis=1)
     zijtpo = zijtpo/np.reshape(zijdenom, (n,1))

     pjtpo = np.mean(zijtpo, axis=0)

I have not been able to vectorize the part below. I need to figure out how.

     betajtpo_1 = []
     for j in range(g):
         num = 0
         denom = 0
         for i in range(n):
             num = num + zijtpo[i][j]*obsdata[i]
             denom = denom + zijtpo[i][j]
         betajtpo_1.append(num/denom)

     betajtpo = np.asarray(betajtpo_1)

     return(pjtpo,betajtpo)

Answer 1

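I'm guessing Python isn't the first programming language you learned. I say that because in Python we usually don't have to manipulate indices; you operate directly on the values or keys that the iteration returns. Please don't take this as an offense, I do the same thing myself, coming from C++. It's a hard habit to lose ;).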
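To illustrate the point about indices, here is a small sketch with made-up arrays (obsdata and weights below are illustrative values, not the data from the question). It computes the same weighted sum three ways: with an index-driven loop, by iterating over the values directly, and with a single NumPy call.

```python
import numpy as np

# Illustrative data only; not taken from the question.
obsdata = np.array([0.5, 1.5, 3.0])
weights = np.array([0.2, 0.3, 0.5])

# Index-driven loop, as one might write it coming from C or C++.
total_by_index = 0.0
for i in range(len(obsdata)):
    total_by_index += weights[i] * obsdata[i]

# Iterating over the values directly; zip pairs the arrays element by element.
total_by_value = 0.0
for w, x in zip(weights, obsdata):
    total_by_value += w * x

# With NumPy, the whole loop collapses into one call.
total_vectorized = np.dot(weights, obsdata)
```

All three give the same result; the last form is usually the fastest on large arrays because the loop runs inside NumPy rather than in the Python interpreter.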

If you are interested in performance, Raymond Hettinger gave a great talk on how to optimize and beautify Python code: https://www.youtube.com/watch?v=OSGv2VnC0go

As for the code you need help with, does this help? I have to leave, so unfortunately it is untested... Reference: Iterating over a numpy array

http://docs.scipy.org/doc/numpy/reference/generated/numpy.true_divide.html

import numpy as np
from scipy import stats

def EMalgofast(obsdata, beta, pjt):
     n = np.shape(obsdata)[0]
     g = np.shape(pjt)[0]
     zijtpo = np.zeros(shape=(n,g))
     for j in range(g):
         zijtpo[:,j] = pjt[j]*stats.expon.pdf(obsdata,scale=beta[j])

     zijdenom = np.sum(zijtpo, axis=1)
     zijtpo = zijtpo/np.reshape(zijdenom, (n,1))

     pjtpo = np.mean(zijtpo, axis=0)

     # accumulate numerators and denominators in arrays, one entry per component j,
     # instead of creating new objects on each iteration
     num = np.zeros(g)
     denom = np.zeros(g)
     # zijtpo has shape (n, g), so ndenumerate yields ((i, j), value)
     for (i, j), value in np.ndenumerate(zijtpo):
         num[j] += value*obsdata[i]
         denom[j] += value

     # divide everything at once after the loop instead of inside it
     betajtpo = np.true_divide(num, denom)

     return(pjtpo,betajtpo)
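The nested loop in the question computes, for each component j, the weighted mean sum_i z[i,j]*x[i] / sum_i z[i,j]. The numerator is a matrix-vector product and the denominator is a column sum, so the Python loops can be dropped entirely. Below is a minimal sketch, assuming obsdata, beta and pjt are all 1-D arrays; the function name em_step_vectorized is made up for illustration and is not from the question.

```python
import numpy as np
from scipy import stats

def em_step_vectorized(obsdata, beta, pjt):
    """One EM update for an exponential mixture, without Python loops.

    Assumes obsdata has shape (n,) and beta, pjt have shape (g,).
    """
    obsdata = np.asarray(obsdata, dtype=float)
    beta = np.asarray(beta, dtype=float)
    pjt = np.asarray(pjt, dtype=float)

    # E-step: broadcasting (n, 1) against (g,) gives an (n, g) array
    # whose column j is pjt[j] * expon.pdf(obsdata, scale=beta[j]).
    zijtpo = pjt * stats.expon.pdf(obsdata[:, None], scale=beta)
    # Normalize each row so the responsibilities sum to 1.
    zijtpo = zijtpo / zijtpo.sum(axis=1, keepdims=True)

    # M-step: new mixing proportions are the column means.
    pjtpo = zijtpo.mean(axis=0)
    # New scales: (g, n) @ (n,) gives the g numerators; column sums give the denominators.
    betajtpo = (zijtpo.T @ obsdata) / zijtpo.sum(axis=0)

    return pjtpo, betajtpo
```

It takes the same three arguments as EMalgofast and returns the same pair, so it could be swapped in and compared against the loop version on a small input before relying on it.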

Please give me some feedback!

Regards,

Eric Lafontaine

