Python / NumPy: vectorizing nested for loops

Time: 2014-10-11 12:10:21

Tags: python for-loop numpy vectorization nested-loops

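Over the last few days I have been trying to shed my FORTRAN sensibilities and embrace the Python ethos, getting rid of as many loops as possible and optimising my code.

Some of the posts on this site have been very helpful in doing that, but I have run into a problem that I am not sure how to solve.

Below is the "for-loop version" of the code. I admit it uses some unnecessary array allocation, but that is just to illustrate the problem: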

nb       = 100                                  # defined before its first use below

mu       = np.zeros( nbk )
mubins   = np.linspace( -1, 1, nbk )
mu[:-1]  = ( mubins[:-1] + mubins[1:] ) / 2.

kbins    = 10**( np.linspace( kmin, kmax, nb ) )
k1       = np.zeros( nbk )
k1[:-1]  = ( kbins[:-1] + kbins[1:] ) / 2.0

for i in range(         nb - 1 ):
    for j in range(     nb - 1 ):
        Bl = np.zeros( Nmodes )                 # ( will be important later ) initialising array here 
        for k in range( nb - 1 ):

            k33[i,j,k] = np.sqrt(  k1[i] * k1[i] + k1[j] * k1[j] - 2 * k1[i] * k1[j] * mu[k] )
            P11[i,j,k] = pkspline( k1[i] )      # just using interp1d from earlier in the code - not important
            P22[i,j,k] = pkspline( k1[j] )
            P33[i,j,k] = pkspline( k33[i,j,k] )

            f212[i,j,k],s212[i,j,k] = S2F2_SLOW(k1[i],k1[j],k33[i,j,k]) # just calling some function - not important
            f213[i,j,k],s213[i,j,k] = S2F2_SLOW(k1[i],k33[i,j,k],k1[j])
            f223[i,j,k],s223[i,j,k] = S2F2_SLOW(k1[j],k33[i,j,k],k1[i])

            # computing B11 to be used in following 'p' loop
            B11 = ( b1*b1*b1*P11[i,j,k]*P22[i,j,k]*2.*f212[i,j,k] + b1**2*b2*P11[i,j,k]*P22[i,j,k] + b1**2*bs2*P11[i,j,k]*P22[i,j,k]*s212[i,j,k]
                  + b1*b1*b1*P11[i,j,k]*P33[i,j,k]*2.*f213[i,j,k] + b1**2*b2*P11[i,j,k]*P33[i,j,k] + b1**2*bs2*P11[i,j,k]*P33[i,j,k]*s213[i,j,k]
                  + b1*b1*b1*P22[i,j,k]*P33[i,j,k]*2.*f223[i,j,k] + b1**2*b2*P22[i,j,k]*P33[i,j,k] + b1**2*bs2*P22[i,j,k]*P33[i,j,k]*s223[i,j,k] )

            # new loop ( this is where my issue is ) v-v-v-v-v-v-v-v-v-v-v-v-v-v-v
            for p in range( Nmodes ):
                Bl[p] = Bl[p] + 2. * pi * LegMu[k,p] * dmu * B11

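So that is the code snippet. When vectorizing it, removing the outer "i" and "j" for loops seems straightforward; the inner "k" and "p" loops are less so.

Here is what I have tried: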

kbins = 10**(np.linspace(kmin,kmax,nb))
kk  = np.zeros(nbk)
kk[:-1]  = (kbins[:-1]+kbins[1:])/2.0

# so from above i now create 2 new arrays that will replace k1[i] and k1[j] in the previous version 
k1 = kk[np.newaxis].T #equivalent to k1[i]
k2 = kk               #equivalent to  k1[j]

#i and j loops now removed and left with k ( i may be able to get rid of the 'k' loop as well but i can't see how)
for k in range(nbk-1):
    k3[:-1,:-1,k]=   np.sqrt(np.square(k2[:-1]) + np.square(k1[:-1]) -2*k1[:-1]*k2[:-1]*mu[k])
    print(k)
    P1[:-1,:-1,k]=pkspline(k1[:-1])
    P2[:-1,:-1,k]=pkspline(k2[:-1])
    P3[:-1,:-1,k]=pkspline(k3[:-1,:-1,k])

    F2_12[:-1,:-1,k],S2_12[:-1,:-1,k]=S2F2(k1[:-1],k2[:-1],k3[:-1,:-1,k])
    F2_13[:-1,:-1,k],S2_13[:-1,:-1,k]=S2F2(k1[:-1],k3[:-1,:-1,k],k2[:-1])
    F2_23[:-1,:-1,k],S2_23[:-1,:-1,k]=S2F2(k2[:-1],k3[:-1,:-1,k],k1[:-1])

    #i've now put BB into a function. 
    B11[:-1,:-1,k] = BB(b1,b2,bs2,P1[:-1,:-1,k],P2[:-1,:-1,k],P3[:-1,:-1,k],S2_12[:-1,:-1,k],S2_13[:-1,:-1,k],S2_23[:-1,:-1,k],F2_12[:-1,:-1,k],F2_13[:-1,:-1,k],F2_23[:-1,:-1,k])

I then took the B array out of the k loop and wrote:

B11 = BB( b1,b2,bs2,P1,P2,P3,S2_12,S2_13,S2_23,F2_12,F2_13,F2_23 )

What I can't seem to figure out, however, is how to continue from here and incorporate the p loop, since it is nested inside the k loop:

for p in range( Nmodes ):
    Bl[p] = Bl[p] + 2. * pi * LegMu[k,p] * dmu * B11

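Crucially, if you look at the first version, I have to set the Bl array to zero before the k loop is called. Further steps that use Bl come after this, but this is where I am stuck for now.

Any help is greatly appreciated!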


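OK, as requested, I will simplify the above to better illustrate the mechanics of the problem. You can ignore the array values I have assigned; it is only an example:

First, the "for-loop version"...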

kbins  = np.linspace( -1, 1, 100 )
mubins = np.linspace( -5, 5, 100 )

nb     = 100

BLB    = np.zeros( 10 )           # <--------------------------------- see q loop

for i in range(         nb - 1 ):
    k1 = ( kbins[i] + kbins[i+1] ) / 2.0

    for j in range(     nb - 1 ): 
        k2 = ( kbins[j] + kbins[j+1] ) / 2.0

        BL = np.zeros( 10 )      # <---------------------------------- see 'p' loop

        for k in range( nb - 1 ):
            mu = ( mubins[k] + mubins[k+1] ) / 2.
            k3 =  np.sqrt( k1 + k2 - 2 * mu )

            x = some_function(k1,k2,k3)
            y = some_function(k1,k3,k2)
            z = some_function(k2,k3,k1)

            B = x + y + z

            for p in range( 10 ):
                BL[p] = BL[p] + 2. * B

            for q in range( 10 ):
                BLB[q] = BLB[q] + BL[q]

So I tried to vectorize this up to the p loop, as follows:

kbins = np.linspace( -1, 1, 100 )                 # as before

# i now define k1 and k2 here as vectors, and not scalars as was above example
kk       = np.zeros( nbk )
kk[:-1]  = ( kbins[:-1] + kbins[1:] ) / 2.0

k1       = kk[np.newaxis].T                       # equivalent to k1 in above
k2       = kk                                     # equivalent to k2 in above

mu[:-1]  = ( mubins[:-1] + mubins[1:] ) /2.       # mu is now an array as well

for k in range( nbk - 1 ):                        # i,j-loops removed,left with k
    k3[:-1,:-1,k] = np.sqrt( k1[:-1] + k2[:-1] - 2 * mu[k] )     # k3 is now an array
    x[ :-1,:-1,k] = some_function(k1[:-1],k2[:-1],k3[:-1,:-1,k]) # x,y,z now arrays to allow looping over elements
    y[ :-1,:-1,k] = some_function(k1[:-1],k3[:-1,:-1,k],k2[:-1])
    z[ :-1,:-1,k] = some_function(k2[:-1],k3[:-1,:-1,k],k1[:-1])

    B11[:-1,:-1,k]= x[:-1,:-1,k] +  y[:-1,:-1,k] + z[:-1,:-1,k]

How do I now compute BL and BLB in the corresponding p and q loops?

I hope this makes more sense.

1 Answer:

Answer 0: (score: 0)

It looks like some_function takes 3 scalars and returns a scalar. B is also a scalar. So

        B = x + y + z

        for p in range( 10 ):
            BL[p] = BL[p] + 2. * B

        for q in range( 10 ):
            BLB[q] = BLB[q] + BL[q]

can be simplified to:

BL += 2*B
BLB += BL

I don't see the point of iterating over p and q. As written, all 10 values of BL are identical, and so are the 10 values of BLB.
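A quick way to check that simplification is to run both forms side by side. This is only a minimal sketch: the value of B is an arbitrary placeholder (an assumption for illustration), and the array length of 10 follows the question.

```python
import numpy as np

B = 1.5  # arbitrary scalar standing in for x + y + z (assumption)

# Loop version, element by element, as written in the question
BL_loop = np.zeros(10)
BLB_loop = np.zeros(10)
for p in range(10):
    BL_loop[p] = BL_loop[p] + 2. * B
for q in range(10):
    BLB_loop[q] = BLB_loop[q] + BL_loop[q]

# Simplified version: in-place addition updates every element at once
BL = np.zeros(10)
BLB = np.zeros(10)
BL += 2 * B
BLB += BL

same = np.array_equal(BL, BL_loop) and np.array_equal(BLB, BLB_loop)
all_equal = bool(np.all(BL == BL[0]))  # every entry of BL holds the same value
```

Both `same` and `all_equal` should come out true, which is the observation above: with a scalar B, the p and q indices carry no extra information.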

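When I try to run your script (using a simple some_function such as k1+k2+k3), I get an occasional error at k3 = np.sqrt( k1 + k2 - 2 * mu ), probably because k1+k2-2*mu goes negative. Ignoring that, I think your script reduces to: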

nb     = 11
kbins  = np.linspace( -1, 1, nb )
mubins = np.linspace( -5, 5, nb )

kx = (kbins[:-1]+kbins[1:])/2.0
kmx = (mubins[:-1]+mubins[1:])/2.0

for k1 in kx:
    BLB = 0
    for k2 in kx:
        BL = 0
        for mu in kmx:
            B = k1+k2-2*mu
            BL += 2*B
        BLB += BL
    print(BLB)

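Here I have played with the nesting of the BL and BLB sums so that they make some sense.

If the inner scalar computation can be generalized to accept 3D arrays, we can vectorize the whole thing.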

k1 = kx[:, None, None]
k2 = kx[None, :, None]
mu = kmx[None, None,:]
B = 2*(k1 + k2 - 2*mu)
# (nb-1, nb-1, nb-1)
BL = np.sum(B, axis=2)
BLB = np.sum(BL, axis=1)
print(BLB)
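To see that the broadcast form reproduces the nested loops, the two can be compared directly. The sketch below reuses the bin definitions from this answer; the names `BLB_loop`, `BLB_vec` and `match` are introduced here purely for the comparison.

```python
import numpy as np

nb = 11
kbins = np.linspace(-1, 1, nb)
mubins = np.linspace(-5, 5, nb)
kx = (kbins[:-1] + kbins[1:]) / 2.0     # bin centres for k1 and k2
kmx = (mubins[:-1] + mubins[1:]) / 2.0  # bin centres for mu

# Loop version: one BLB value per k1
BLB_loop = []
for k1 in kx:
    BLB = 0.0
    for k2 in kx:
        BL = 0.0
        for mu in kmx:
            BL += 2 * (k1 + k2 - 2 * mu)
        BLB += BL
    BLB_loop.append(BLB)
BLB_loop = np.array(BLB_loop)

# Broadcast version: axes are (k1, k2, mu)
B = 2 * (kx[:, None, None] + kx[None, :, None] - 2 * kmx[None, None, :])
BLB_vec = B.sum(axis=2).sum(axis=1)     # sum over mu, then over k2

match = np.allclose(BLB_loop, BLB_vec)
```

Summing over `axis=2` removes the mu axis and summing over `axis=1` then removes the k2 axis, leaving one value per k1, just as the loop prints one BLB per outer iteration.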

Looking back at the first script, the p loop is a bit more complicated

for p in range( Nmodes ):
    Bl[p] += X[k,p] * B

This should work as (assuming X is some kind of 2D array of coefficients):

Bl += X[k,:] * B
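Going one step beyond `Bl += X[k,:] * B`, the k loop itself might also be removed once B11 is available as a full 3D array. The answer above does not do this; the sketch below is one possible approach using `np.einsum`. It assumes B11 has shape (n, n, n) indexed as [i, j, k] and LegMu has shape (n, Nmodes) indexed as [k, p]. The sizes, the value of dmu and the random data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n, Nmodes = 4, 3                  # small illustrative sizes (assumption)
dmu = 0.1                         # placeholder bin width (assumption)
B11 = rng.random((n, n, n))       # stand-in for B11[i, j, k]
LegMu = rng.random((n, Nmodes))   # stand-in for LegMu[k, p]

# Loop version: Bl is reset for every (i, j) pair, then accumulated over k
Bl_loop = np.zeros((n, n, Nmodes))
for i in range(n):
    for j in range(n):
        Bl = np.zeros(Nmodes)
        for k in range(n):
            Bl += 2. * np.pi * LegMu[k, :] * dmu * B11[i, j, k]
        Bl_loop[i, j, :] = Bl

# Vectorized version: contract the k axis of B11 with the k axis of LegMu
Bl_vec = 2. * np.pi * dmu * np.einsum('ijk,kp->ijp', B11, LegMu)

match = np.allclose(Bl_loop, Bl_vec)
```

Resetting Bl to zero before each k loop corresponds to the result having its own (i, j) slot, so nothing carries over between pairs. The same contraction can also be written as `B11 @ LegMu` or `np.tensordot(B11, LegMu, axes=([2], [0]))`.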