这部分我能够矢量化并摆脱嵌套循环。
def EMalgofast(obsdata, beta, pjt):
n = np.shape(obsdata)[0]
g = np.shape(pjt)[0]
zijtpo = np.zeros(shape=(n,g))
for j in range(g):
zijtpo[:,j] = pjt[j]*stats.expon.pdf(obsdata,scale=beta[j])
zijdenom = np.sum(zijtpo, axis=1)
zijtpo = zijtpo/np.reshape(zijdenom, (n,1))
pjtpo = np.mean(zijtpo, axis=0)
我无法对下面的部分进行矢量化。我需要弄清楚
betajtpo_1 = []
for j in range(g):
num = 0
denom = 0
for i in range(n):
num = num + zijtpo[i][j]*obsdata[i]
denom = denom + zijtpo[i][j]
betajtpo_1.append(num/denom)
betajtpo = np.asarray(betajtpo_1)
return(pjtpo,betajtpo)
答案 0 :(得分:0)
我猜测Python并不是你看到的第一种编程语言。我之所以这样说是因为在python中,通常我们不必处理操纵索引。您直接对返回的值或键执行操作。请确保不要将此视为违法行为,我自己也会使用C ++。这很难消除习惯;)。
如果您对性能感兴趣,那么Raymond Hettinger就如何优化和优化Python做了很好的演示: https://www.youtube.com/watch?v=OSGv2VnC0go
至于你需要帮助的代码,这对你有帮助吗?我需要离开,这是不幸的未经考验... 参考: Iterating over a numpy array
http://docs.scipy.org/doc/numpy/reference/generated/numpy.true_divide.html
def EMalgofast(obsdata, beta, pjt):
n = np.shape(obsdata)[0]
g = np.shape(pjt)[0]
zijtpo = np.zeros(shape=(n,g))
for j in range(g):
zijtpo[:,j] = pjt[j]*stats.expon.pdf(obsdata,scale=beta[j])
zijdenom = np.sum(zijtpo, axis=1)
zijtpo = zijtpo/np.reshape(zijdenom, (n,1))
pjtpo = np.mean(zijtpo, axis=0)
betajtpo_1 = []
#manipulating an array of numerator and denominator instead of creating objects each iteration
num=np.zeros(shape=(g,1))
denom=np.zeros(shape=(g,1))
#generating the num and denom real value for the end result
for (x,y), value in numpy.ndenumerate(zijtpo):
num[x],denom[x] = num[x] + value *obsdata[y],denom[x] + value
#dividing all at once after instead of inside the loop
betajtpo_1= np.true_divide(num/denom)
betajtpo = np.asarray(betajtpo_1)
return(pjtpo,betajtpo)
请给我一些反馈!
此致
Eric Lafontaine