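Over the past few days I have been trying to shed my FORTRAN habits and embrace the Python way of doing things, removing as many loops as possible and optimising my code.
Several posts on this site have been very helpful with that, but I have run into a problem I am not sure how to get around.
Below is the "for-loop version" of the code. I admit it uses some unnecessary array allocation, but that is only to illustrate the problem: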
nb = 100
mu = np.zeros( nbk )
mubins = np.linspace( -1, 1, nbk )
mu[:-1] = ( mubins[:-1] + mubins[1:] ) / 2.
kbins = 10**( np.linspace( kmin, kmax, nb ) )
k1 = np.zeros( nbk )
k1[:-1] = ( kbins[:-1] + kbins[1:] ) / 2.0
for i in range( nb - 1 ):
    for j in range( nb - 1 ):
        Bl = np.zeros( Nmodes ) # ( will be important later ) initialising array here
        for k in range( nb - 1 ):
            k33[i,j,k] = np.sqrt( k1[i] * k1[i] + k1[j] * k1[j] - 2 * k1[i] * k1[j] * mu[k] )
            P11[i,j,k] = pkspline( k1[i] ) # just using interp1d from earlier in the code - not important
            P22[i,j,k] = pkspline( k1[j] )
            P33[i,j,k] = pkspline( k33[i,j,k] )
            f212[i,j,k],s212[i,j,k] = S2F2_SLOW(k1[i],k1[j],k33[i,j,k]) # just calling some function - not important
            f213[i,j,k],s213[i,j,k] = S2F2_SLOW(k1[i],k33[i,j,k],k1[j])
            f223[i,j,k],s223[i,j,k] = S2F2_SLOW(k1[j],k33[i,j,k],k1[i])
            # computing B11 to be used in following 'p' loop
            B11=b1*b1*b1*P11[i,j,k]*P22[i,j,k]*2.*f212[i,j,k] + b1**2*b2*P11[i,j,k]*P22[i,j,k] + b1**2*bs2*P11[i,j,k]*P22[i,j,k]*s212[i,j,k] + b1*b1*b1*P11[i,j,k]*P33[i,j,k]*2.*f213[i,j,k] + b1**2*b2*P11[i,j,k]*P33[i,j,k] + b1**2*bs2*P11[i,j,k]*P33[i,j,k]*s213[i,j,k] + b1*b1*b1*P22[i,j,k]*P33[i,j,k]*2.*f223[i,j,k] + b1**2*b2*P22[i,j,k]*P33[i,j,k] + b1**2*bs2*P22[i,j,k]*P33[i,j,k]*s223[i,j,k]
            # new loop ( this is where my issue is ) v-v-v-v-v-v-v-v-v-v-v-v-v-v-v
            for p in range( Nmodes ):
                Bl[p] = Bl[p] + 2. * pi * LegMu[k,p] * dmu * B11
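So that is the code snippet. Looking at it, it seems feasible to remove the outer 'i' and 'j' for-loops as well as the inner 'k' and 'p' loops.
Here is what I have tried so far: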
kbins = 10**(np.linspace(kmin,kmax,nb))
kk = np.zeros(nbk)
kk[:-1] = (kbins[:-1]+kbins[1:])/2.0
# so from above I now create 2 new arrays that will replace k1[i] and k1[j] in the previous version
k1 = kk[np.newaxis].T # column vector, equivalent to k1[i]
k2 = kk # row vector, equivalent to k1[j]
# i and j loops now removed and left with k ( I may be able to get rid of the 'k' loop as well but I can't see how )
for k in range(nbk-1):
    k3[:-1,:-1,k]= np.sqrt(np.square(k2[:-1]) + np.square(k1[:-1]) -2*k1[:-1]*k2[:-1]*mu[k])
    print(k)
    P1[:-1,:-1,k]=pkspline(k1[:-1])
    P2[:-1,:-1,k]=pkspline(k2[:-1])
    P3[:-1,:-1,k]=pkspline(k3[:-1,:-1,k])
    F2_12[:-1,:-1,k],S2_12[:-1,:-1,k]=S2F2(k1[:-1],k2[:-1],k3[:-1,:-1,k])
    F2_13[:-1,:-1,k],S2_13[:-1,:-1,k]=S2F2(k1[:-1],k3[:-1,:-1,k],k2[:-1])
    F2_23[:-1,:-1,k],S2_23[:-1,:-1,k]=S2F2(k2[:-1],k3[:-1,:-1,k],k1[:-1])
    # I've now put BB into a function.
    B11[:-1,:-1,k] = BB(b1,b2,bs2,P1[:-1,:-1,k],P2[:-1,:-1,k],P3[:-1,:-1,k],S2_12[:-1,:-1,k],S2_13[:-1,:-1,k],S2_23[:-1,:-1,k],F2_12[:-1,:-1,k],F2_13[:-1,:-1,k],F2_23[:-1,:-1,k])
I then took the B11 array out of the 'k' loop and wrote:
B11 = BB( b1,b2,bs2,P1,P2,P3,S2_12,S2_13,S2_23,F2_12,F2_13,F2_23 )
However, what I cannot work out is how to carry on from here and incorporate the 'p' loop, since it is nested inside the 'k' loop:
for p in range( Nmodes ):
    Bl[p] = Bl[p] + 2. * pi * LegMu[k,p] * dmu * B11
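Crucially, if you look at the first version, I have to set the Bl array to zero before the 'k' loop is entered. Bl is used further on in the code after this, but this is where I am stuck for now.
Any help is much appreciated!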
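OK, as requested, I will simplify the above to better illustrate the mechanics of the problem. You can ignore the array values I assign; it is only an example.
First, the "for-loop version":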
nb = 100
kbins = np.linspace( -1, 1, nb )
mubins = np.linspace( -5, 5, nb )
BLB = np.zeros( 10 ) # <--------------------------------- see 'q' loop
for i in range( nb - 1 ):
    k1 = ( kbins[i] + kbins[i+1] ) / 2.0
    for j in range( nb - 1 ):
        k2 = ( kbins[j] + kbins[j+1] ) / 2.0
        BL = np.zeros( 10 ) # <---------------------------------- see 'p' loop
        for k in range( nb - 1 ):
            mu = ( mubins[k] + mubins[k+1] ) / 2.
            k3 = np.sqrt( k1 + k2 - 2 * mu )
            x = some_function(k1,k2,k3)
            y = some_function(k1,k3,k2)
            z = some_function(k2,k3,k1)
            B = x + y + z
            for p in range( 10 ):
                BL[p] = BL[p] + 2. * B
        for q in range( 10 ):
            BLB[q] = BLB[q] + BL[q]
So I have tried to set this up without the 'i' and 'j' loops, as follows:
kbins = np.linspace( -1, 1, 100 ) # as before
# I now define k1 and k2 here as vectors, and not scalars as in the example above
kk = np.zeros( nbk )
kk[:-1] = ( kbins[:-1] + kbins[1:] ) / 2.0
k1 = kk[np.newaxis].T # equivalent to k1 in above
k2 = kk # equivalent to k2 in above
mu[:-1] = ( mubins[:-1] + mubins[1:] ) /2. # mu is now an array as well
for k in range( nbk - 1 ): # i,j-loops removed, left with k
    k3[:-1,:-1,k] = np.sqrt( k1[:-1] + k2[:-1] - 2 * mu[k] ) # k3 is now an array
    x[ :-1,:-1,k] = some_function(k1[:-1],k2[:-1],k3[:-1,:-1,k]) # x,y,z now arrays to allow looping over elements
    y[ :-1,:-1,k] = some_function(k1[:-1],k3[:-1,:-1,k],k2[:-1])
    z[ :-1,:-1,k] = some_function(k2[:-1],k3[:-1,:-1,k],k1[:-1])
    B11[:-1,:-1,k]= x[:-1,:-1,k] + y[:-1,:-1,k] + z[:-1,:-1,k]
But how do I then compute BL and BLB in the corresponding 'p' and 'q' loops?
I hope this makes more sense.
Answer 0 (score: 0)
It looks like some_function takes 3 scalars and returns a scalar, so B is a scalar as well. That means
B = x + y + z
for p in range( 10 ):
    BL[p] = BL[p] + 2. * B
for q in range( 10 ):
    BLB[q] = BLB[q] + BL[q]
can be simplified to:
BL += 2*B
BLB += BL
I don't see the point of iterating over p and q. As written, all 10 values of BL are identical, and so are the 10 values of BLB.
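A quick way to convince yourself of that simplification is to run both forms side by side. This is only a minimal sketch: the scalar value 1.5 is made up and stands in for x + y + z, and the _loop/_vec names are mine, not from the question.

```python
import numpy as np

B = 1.5  # arbitrary scalar standing in for x + y + z (assumption)

# element-by-element loops, as written in the question
BL_loop = np.zeros(10)
BLB_loop = np.zeros(10)
for p in range(10):
    BL_loop[p] = BL_loop[p] + 2. * B
for q in range(10):
    BLB_loop[q] = BLB_loop[q] + BL_loop[q]

# whole-array in-place updates
BL_vec = np.zeros(10)
BLB_vec = np.zeros(10)
BL_vec += 2 * B      # the scalar is broadcast to every element
BLB_vec += BL_vec    # element-wise add of two length-10 arrays

print(np.allclose(BL_loop, BL_vec), np.allclose(BLB_loop, BLB_vec))
```

Because the same scalar is added to every element, each array holds a single repeated value, which is why the explicit p and q loops add nothing here.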
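When I tried to run your script (with a simple some_function such as k1+k2+k3), I got an intermittent error at k3 = np.sqrt( k1 + k2 - 2 * mu ), probably because k1+k2-2*mu can become negative. Ignoring that, I think your script reduces to: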
import numpy as np

nb = 11
kbins = np.linspace( -1, 1, nb )
mubins = np.linspace( -5, 5, nb )
kx = (kbins[:-1]+kbins[1:])/2.0    # bin centres for k1 and k2
kmx = (mubins[:-1]+mubins[1:])/2.0 # bin centres for mu
for k1 in kx:
    BLB = 0
    for k2 in kx:
        BL = 0
        for mu in kmx:
            B = k1+k2-2*mu
            BL += 2*B
        BLB += BL
    print(BLB)
Here I have played around with the nesting of the BL and BLB summations so that they make some sense.
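On the square-root problem: the following sketch, using the same made-up bin edges as my reduced script, counts how many (k1, k2, mu) combinations give a negative argument. np.sqrt returns nan for those entries and emits a RuntimeWarning, which I silence here with np.errstate. The names arg and n_negative are just for illustration.

```python
import numpy as np

nb = 11
kbins = np.linspace(-1, 1, nb)
mubins = np.linspace(-5, 5, nb)
kx = (kbins[:-1] + kbins[1:]) / 2.0
kmx = (mubins[:-1] + mubins[1:]) / 2.0

# argument of the square root for every (k1, k2, mu) combination
arg = kx[:, None, None] + kx[None, :, None] - 2 * kmx[None, None, :]
n_negative = np.count_nonzero(arg < 0)

# suppress the "invalid value" warning; negative entries become nan
with np.errstate(invalid='ignore'):
    k3 = np.sqrt(arg)

print(n_negative, np.count_nonzero(np.isnan(k3)))
```

With the real k and mu ranges from the first script this may never happen; it is an artefact of the example values.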
If the inner scalar calculation can be generalised to take 3D arrays, we can vectorise the whole thing:
k1 = kx[:, None, None]
k2 = kx[None, :, None]
mu = kmx[None, None,:]
B = 2*(k1 + k2 - 2*mu)
# (nb-1, nb-1, nb-1)
BL = np.sum(B, axis=2)
BLB = np.sum(BL, axis=1)
print(BLB)
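To check that the broadcast version really reproduces the nested loops, here is a self-contained comparison. It is a sketch using the same made-up bin edges as above; BLB_loop and BLB_vec are names I introduce for the comparison.

```python
import numpy as np

nb = 11
kbins = np.linspace(-1, 1, nb)
mubins = np.linspace(-5, 5, nb)
kx = (kbins[:-1] + kbins[1:]) / 2.0
kmx = (mubins[:-1] + mubins[1:]) / 2.0

# loop version: one BLB value per k1
BLB_loop = []
for k1 in kx:
    BLB = 0.0
    for k2 in kx:
        BL = 0.0
        for mu in kmx:
            BL += 2 * (k1 + k2 - 2 * mu)
        BLB += BL
    BLB_loop.append(BLB)
BLB_loop = np.array(BLB_loop)

# broadcast version: axes are (k1, k2, mu)
B = 2 * (kx[:, None, None] + kx[None, :, None] - 2 * kmx[None, None, :])
BLB_vec = B.sum(axis=2).sum(axis=1)  # sum over mu, then over k2

print(np.allclose(BLB_loop, BLB_vec))
```

The key point is that each loop variable gets its own array axis, so the order of the sums (mu first, then k2) maps directly onto the axis arguments.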
Looking back at the first script, the p loop is a bit more complicated:
for p in range( Nmodes ):
    Bl[p] += X[k,p] * B
This should work as the following (assuming X is some sort of 2d coefficient array):
Bl += X[k,:] * B
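If B11 has already been built as a full 3D array, the k loop can go as well, because the sum over k is just a contraction of B11 with the coefficient array. The sketch below assumes B11 has shape (n, n, nmu) and LegMu has shape (nmu, Nmodes); the sizes, the random data, and dmu = 0.1 are all made up for illustration, and the result keeps one Bl vector per (i, j) pair rather than overwriting a single Bl.

```python
import numpy as np

rng = np.random.default_rng(0)
n, nmu, Nmodes = 4, 5, 3            # small made-up sizes
B11 = rng.random((n, n, nmu))       # stand-in for the vectorised B11[i, j, k]
LegMu = rng.random((nmu, Nmodes))   # stand-in coefficient array LegMu[k, p]
dmu = 0.1                           # made-up bin width

# reference: explicit loops, with Bl reset to zero for every (i, j)
Bl_loop = np.zeros((n, n, Nmodes))
for i in range(n):
    for j in range(n):
        Bl = np.zeros(Nmodes)
        for k in range(nmu):
            Bl += 2. * np.pi * LegMu[k, :] * dmu * B11[i, j, k]
        Bl_loop[i, j] = Bl

# vectorised: contract the k axis of B11 with the k axis of LegMu
Bl_vec = 2. * np.pi * dmu * np.tensordot(B11, LegMu, axes=([2], [0]))

# the same contraction written with einsum
Bl_einsum = 2. * np.pi * dmu * np.einsum('ijk,kp->ijp', B11, LegMu)

print(np.allclose(Bl_loop, Bl_vec), np.allclose(Bl_loop, Bl_einsum))
```

Resetting Bl to zero before the k loop corresponds to starting each (i, j) sum from scratch, which the contraction does automatically. Whether tensordot or einsum is faster depends on the array sizes, so it is worth timing both on the real data.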