Indexing and vectorizing nested loops

Time: 2016-01-18 17:38:39

Tags: python performance multidimensional-array vectorization nested-loops

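I am running a script that computes the fractal dimension of input data. The script runs correctly but is very slow, and profiling it with cProfile shows that the function boxcount accounts for about 90% of the run time. I ran into similar issues in my earlier questions "More efficient way to loop?" and "Vectorization of a nested for-loop". According to cProfile, the function itself is not slow, but the script has to call it a very large number of times. I am struggling to find a way to rewrite this so as to eliminate the large number of function calls. Here is the code: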

for j in range(starty, endy):
    jmin=j-half_tile
    jmax=j+half_tile+1

    # Loop over columns

    for i in range(startx, endx):
        imin=i-half_tile
        imax=i+half_tile+1

        # Extract a subset of points from the input grid, centered on the current
        # point. The size of the tile is given by the current entry of the tile list.

        z = surface[imin:imax, jmin:jmax]
        # print 'Tile created. Size:', z.shape

        # Calculate fractal dimension of the tile using 3D box-counting

        fd, intercept = boxcount(z,dx,nside,cell,slice_size,box_size)
        FractalDim[i,j] = fd
        Lacunarity[i,j] = intercept
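
For reference, the kind of cProfile measurement described above can be reproduced roughly as follows. This is only a minimal sketch: `boxcount_stub` and `run` are hypothetical stand-ins for the real function and the driving loops, not code from the actual script.

```python
import cProfile
import io
import pstats


def boxcount_stub(n):
    # Hypothetical stand-in for boxcount: cheap per call, but called many times.
    return sum(k * k for k in range(n))


def run():
    # Hypothetical stand-in for the nested i, j loops that call boxcount.
    total = 0
    for _ in range(2000):
        total += boxcount_stub(50)
    return total


# Profile only the call to run().
profiler = cProfile.Profile()
profiler.enable()
result = run()
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

In the printed report, the ncalls column shows 2000 calls to `boxcount_stub`. A small per-call cost multiplied by a large call count is the pattern described in this question.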

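My real problem is that, on every pass through the i, j loops, the code computes imin, imax, jmin, and jmax and uses them to create a subset of the input data centered on the current point (i, j). The function of interest, boxcount, is then evaluated over that same range. For this example, half_tile is 6; starty, endy, startx, endx are 6, 271, 5, 210 respectively; and dx, cell, nside, slice_size, box_size are all just constants used inside boxcount.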

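I have had problems similar to this before, but without the complication of centering the data slice on a particular point. Can this be vectorized, or otherwise improved?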
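
On the centering complication specifically: one possible way to obtain every centered tile without recomputing imin, imax, jmin, jmax is `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later). The sketch below uses a small made-up `surface` array. It only removes the slicing bookkeeping and does not by itself reduce the number of boxcount calls.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

half_tile = 6
size = 2 * half_tile + 1  # 13 points per tile side

# Made-up stand-in for the input grid (assumption, not the real data).
surface = np.arange(30 * 40, dtype=float).reshape(30, 40)

# Read-only view of shape (rows - size + 1, cols - size + 1, size, size);
# no tile data is copied.
tiles = sliding_window_view(surface, (size, size))

# The tile centered on (i, j) sits at window index (i - half_tile, j - half_tile).
i, j = 10, 20
z = tiles[i - half_tile, j - half_tile]

# Same data as the explicit slice used in the loop.
same = np.array_equal(
    z, surface[i - half_tile:i + half_tile + 1, j - half_tile:j + half_tile + 1]
)
print(tiles.shape, same)
```

Because `tiles` is a view, building it is cheap even for a large grid. Whether it helps overall depends on whether boxcount itself can be rewritten to operate on many tiles at once.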

Edit

Here is the code for the function boxcount, as requested.

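import math
from math import ceil  # assumed source of the unqualified ceil() call below

import numpy as np

def boxcount(z,dx,nside,cell,slice_size,box_size):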
    # fractal dimension calculation using box-counting method

    n = 5 # number of graph points for simple linear regression
    gx = [] # x coordinates of graph points
    gy = [] # y coordinates of graph points


    boxCount = np.zeros((5)) 
    cell_set = np.reshape(np.zeros((5*(nside**3))), (nside**3,5))

    nslice=nside**2

# Box is centered at the mid-point of the tile. Calculate, for each point in the
# tile, which voxel contains the point

    # // keeps the array index an integer under both Python 2 and Python 3
    z0 = z[nside//2,nside//2]-dx*nside/2

    for j in range(1,13):
        for i in range(1,13):
            ij = (j-1)*12 + i
#            print 'i, j:', i, j

            delz1 = z[i-1,j-1]-z0
            delz2 = z[i-1,j]-z0
            delz3 = z[i,j-1]-z0
            delz4 = z[i,j]-z0
            delz = 0.25*(delz1+delz2+delz3+delz4)

            if delz < 0.0:
                break
            slice = ceil(delz)
#            print " delz:",delz," slice:",slice

            # Identify the voxel occupied by current point
            ijk = int(slice-1.)*nslice + (j-1)*nside + i

            for k in range(5):
                if cell_set[cell[ijk,k],k] != 1:
                    cell_set[cell[ijk,k],k] = 1

#   Set any cells deeper than this one equal to one as well
#                   index = cell[ijk,k]
#                   for l in range(int(index),box_size[k],slice_size[k]):
#                       cell_set[l,k] = 1

# Count number of filled boxes for each box size
    boxCount = np.sum(cell_set,axis=0)
#   print "boxCount:", boxCount

    for ib in range(1,n+1):
#       print "ib:",ib," x(ib):",math.log(1.0/ib),"  y(ib):",math.log(boxCount[ib-1])
        gx.append( math.log(1.0/ib) )
        gy.append( math.log(boxCount[ib-1]) )

# simple linear regression
    m, b = np.polyfit(gx,gy,1)
#   print "Polyfit: Slope:", m,' Intercept:', b
#   fd = m-1
    fd = max(2.,m)
    return fd, b
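
As a partial illustration of what vectorizing could look like here, the double loop over i and j inside boxcount can in principle be replaced by array operations. The sketch below is not a drop-in replacement. It assumes a square tile of shape (n+1, n+1) in place of the hard-coded 12, takes z0, nside, and nslice as arguments, and uses the hypothetical names `voxel_indices_loop` and `voxel_indices_vectorized`. It only computes the ijk voxel indices, including the effect of the `break` on the rest of each j.

```python
import math

import numpy as np


def voxel_indices_loop(z, z0, nside, nslice):
    # Reference version: same arithmetic as the double loop in boxcount.
    n = z.shape[0] - 1
    out = []
    for j in range(1, n + 1):
        for i in range(1, n + 1):
            delz = 0.25 * ((z[i - 1, j - 1] - z0) + (z[i - 1, j] - z0)
                           + (z[i, j - 1] - z0) + (z[i, j] - z0))
            if delz < 0.0:
                break  # skips the remaining i values for this j
            out.append(int(math.ceil(delz) - 1.0) * nslice + (j - 1) * nside + i)
    return np.array(out, dtype=int)


def voxel_indices_vectorized(z, z0, nside, nslice):
    # Array version (a sketch under the assumptions stated above).
    n = z.shape[0] - 1
    zr = z - z0
    # delz[i-1, j-1] is the mean of the four corner values used at loop index (i, j).
    delz = 0.25 * (zr[:-1, :-1] + zr[:-1, 1:] + zr[1:, :-1] + zr[1:, 1:])
    # Emulate the break: for each j, keep only the i values before the first negative delz.
    keep = np.logical_and.accumulate(delz >= 0.0, axis=0)
    i_idx = np.arange(1, n + 1)[:, None]
    j_idx = np.arange(1, n + 1)[None, :]
    ijk = (np.ceil(delz) - 1.0).astype(int) * nslice + (j_idx - 1) * nside + i_idx
    # Transpose so the result is ordered with j outermost, like the loops.
    return ijk.T[keep.T]


rng = np.random.default_rng(0)
z_demo = rng.normal(loc=1.0, scale=2.0, size=(13, 13))
agree = np.array_equal(
    voxel_indices_loop(z_demo, 0.0, 12, 144),
    voxel_indices_vectorized(z_demo, 0.0, 12, 144),
)
print(agree)
```

If `cell` holds integer indices, the resulting array could probably feed a single fancy-indexed assignment into `cell_set` in place of the k loop, but that depends on how `cell` is built and is not shown here.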

0 answers:

No answers yet