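Bottleneck in a NumPy-based recursive function for undoing max pooling

Time: 2017-01-17 01:50:32

Tags: python numpy recursion

I wrote a recursive function to handle a specific problem from the deep learning community. In most cases it runs quickly and works well, but in others it takes about 20 minutes for no apparent reason. In the simplest case, the function boils down to a plain "repeat" applied along two axes. This is the code I use to test the function: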

import time

import numpy as np


def recursive_upsample(fMap, index, dims):
    if index == 0:
        return fMap
    else:
        start = time.time()
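        # Output buffer for the next-larger level; only the odd-sized branches fill it,
        # the even/even branch below replaces it with the result of repeat().
        upscale = np.zeros((dims[index-1][0],dims[index-1][1],fMap.shape[-1]))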
        if dims[index-1][0] % 2 == 1 and dims[index-1][1] % 2 == 1:
            crop = fMap[:fMap.shape[0]-1,:fMap.shape[1]-1]
            consX = fMap[-1,:][:-1]
            consY = fMap[:,-1][:-1]
            corner = fMap[-1,-1]
            crop = crop.repeat(2, axis=0).repeat(2, axis=1)
            upscale[:crop.shape[0],:crop.shape[1]] = crop
            upscale[-1,:][:-1] = consX.repeat(2,axis=0)
            upscale[:,-1][:-1] = consY.repeat(2,axis=0)
            upscale[-1,-1] = corner

        elif dims[index-1][0] % 2 == 1:
            crop = fMap[:fMap.shape[0]-1]
            consX = fMap[-1:,]
            crop = crop.repeat(2, axis=0).repeat(2, axis=1)
            upscale[:crop.shape[0]] = crop
            upscale[-1:,] = consX.repeat(2,axis=1)

        elif dims[index-1][1] % 2 == 1:
            crop = fMap[:,:fMap.shape[1]-1]
            consY = fMap[:,-1]
            crop = crop.repeat(2, axis=0).repeat(2, axis=1)
            upscale[:,:crop.shape[1]] = crop
            upscale[:,-1] = consY.repeat(2,axis=0)


        else:
            upscale = fMap.repeat(2, axis=0).repeat(2, axis=1)

        print('Upscaling from {} to {} took {} seconds'.format(fMap.shape,upscale.shape,time.time() - start))
        fMap = upscale

        return recursive_upsample(fMap,index-1,dims)

if __name__ == '__main__':
    dims = [(634,1020,64),(317,510,128),(159,255,256),(80,128,512),(40,64,512)]
    images = []
    for dim in dims:
        image = np.random.rand(dim[0],dim[1],dim[2])
        images.append(image)
    start = time.time()
    upsampled = []
    for index,image in enumerate(images):
        upsampled.append(recursive_upsample(image,index,dims))
    print('Upsampling took {} seconds'.format(time.time() - start))

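For some strange reason, when the feature map that starts out with shape (40,64,512) reaches the recursion step that upsamples it from (317,510,512) to (634,1020,512), that single step takes 911 seconds! I have started rewriting this code in Theano, but are there potential problems in the code that I should look at first? My current guess is that this computation is simply heavy on the CPU, but I don't see why such a simple function would be. Any tips on how to make this function faster would also be much appreciated!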
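As a rough size check on the step described above, the sketch below estimates how much memory each intermediate array needs. It assumes float64 data (what `np.random.rand` returns) and uses a hypothetical helper named `nbytes`. Whether memory pressure actually explains the 911 seconds depends on the machine's available RAM, which the post does not state, so treat this as one possible contributor rather than a diagnosis.

```python
import numpy as np

# Spatial sizes the (40, 64, 512) map passes through, taken from dims in the post.
steps = [(40, 64), (80, 128), (159, 255), (317, 510), (634, 1020)]
channels = 512
itemsize = np.dtype(np.float64).itemsize  # np.random.rand produces float64


def nbytes(h, w, c=channels):
    """Bytes needed for a dense (h, w, c) float64 array (hypothetical helper)."""
    return h * w * c * itemsize


for h, w in steps:
    print('{:>4} x {:>4} x {}: {:.2f} GiB'.format(h, w, channels, nbytes(h, w) / 2**30))

# The final array alone, before counting the temporary made by the first repeat().
final_gib = nbytes(634, 1020) / 2**30
```

The last step also creates a temporary of shape (634,510,512) from the first `repeat` before the second `repeat` produces the full result, so peak usage is higher than the final array alone.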

1 Answer:

Answer 0: (score: 0)

There is no need for recursion. For example, for the (40,64,512) image you can do this directly:

upsampled = image.repeat(16, axis=0).repeat(16, axis=1)[:634,:1020]
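The one-liner above can be generalized to any level of `dims`. The sketch below is a minimal version under stated assumptions: `upsample_direct` and `upsample_stepwise` are hypothetical names, not from the original post, and every level is assumed to be a factor-of-two step with the odd sizes produced by cropping one row or column. The stepwise function is a compact stand-in for the question's branches (double, then crop to the target size), used here only to check that the direct version gives the same values.

```python
import numpy as np

dims = [(634, 1020, 64), (317, 510, 128), (159, 255, 256), (80, 128, 512), (40, 64, 512)]


def upsample_stepwise(fmap, index, dims):
    """Double one level at a time, cropping to each level's height and width."""
    for h, w in reversed([d[:2] for d in dims[:index]]):
        fmap = fmap.repeat(2, axis=0).repeat(2, axis=1)[:h, :w]
    return fmap


def upsample_direct(fmap, index, dims):
    """Repeat by 2**index in one go, then crop to the full-resolution size."""
    factor = 2 ** index
    h, w = dims[0][:2]
    return fmap.repeat(factor, axis=0).repeat(factor, axis=1)[:h, :w]


# Small channel count so the check stays cheap; the spatial sizes match the post.
small = np.random.rand(40, 64, 2)
direct = upsample_direct(small, 4, dims)
stepwise = upsample_stepwise(small, 4, dims)
print(direct.shape, np.array_equal(direct, stepwise))
```

Both versions map output row `r` to source row `r // 2**index`, which is why cropping once at the end matches cropping at every level. The direct version still has to allocate the full-size result, so it removes the recursion but not the memory cost of the largest array.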