Question

I spend most of my computation time in scipy.ndimage.filters.laplace().

The main advantage of scipy and numpy is vectorized computation in C/C++ wrapped in Python. scipy.ndimage.filters.laplace() is built on _nd_image.correlate1d, which is part of the optimized library nd_image.h.

Is there a faster way to do this for arrays of size 10-100?

Definition: Laplace filter (ignoring the division)

a[i-1] - 2*a[i] + a[i+1]
Optionally, and ideally, it would wrap around at the boundary: a[n-2] - 2*a[n-1] + a[0] for n = a.shape[0]
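For reference, the definition above with wrap-around boundaries can be written in plain NumPy. This is only a minimal sketch of the formula, not a claim about speed; the function name laplace_wrap_1d is made up for illustration.

```python
import numpy as np

def laplace_wrap_1d(a):
    """Wrapped 1-D Laplacian: a[i-1] - 2*a[i] + a[i+1], indices taken modulo n."""
    a = np.asarray(a, dtype=np.float64)
    # np.roll(a, 1)[i] is a[i-1]; np.roll(a, -1)[i] is a[i+1] (both wrap around)
    return np.roll(a, 1) - 2.0 * a + np.roll(a, -1)

a = np.array([0.0, 1.0, 4.0, 9.0])
result = laplace_wrap_1d(a)
print(result)  # [ 10.   2.   2. -14.]
```

The last element is a[n-2] - 2*a[n-1] + a[0] = 4 - 18 + 0 = -14, which matches the wrapped boundary case in the definition.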

Answer 1

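The problem stems from scipy's thorough error handling and debugging support. When the user already knows what they are doing, those checks only add overhead.

The following code strips away the scipy clutter in the Python layer and calls the underlying C function directly, for a ~6x speedup on small arrays: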

laplace == Mine ? True
testing timings...
array size 10
100000 loops, best of 3: 12.7 µs per loop
100000 loops, best of 3: 2.3 µs per loop
array size 100
100000 loops, best of 3: 12.7 µs per loop
100000 loops, best of 3: 2.5 µs per loop
array size 100000
1000 loops, best of 3: 413 µs per loop
1000 loops, best of 3: 404 µs per loop

Code

from scipy import ndimage
from scipy.ndimage import _nd_image  # private SciPy module; may change between versions
import numpy as np

# Second-difference kernel, stored as float64
laplace_filter = np.asarray([1, -2, 1], dtype=np.float64)

def fastLaplaceNd(arr):
    output = np.zeros(arr.shape, 'float64')
    if arr.ndim > 0:
        # Positional args: input, weights, axis, output, mode (1 = 'wrap'), cval, origin
        _nd_image.correlate1d(arr, laplace_filter, 0, output, 1, 0.0, 0)
        if arr.ndim == 1:
            return output
        tmp = np.empty_like(output)
        for ax in range(1, arr.ndim):
            # correlate1d fills its output argument and returns None,
            # so filter into a scratch buffer and accumulate
            _nd_image.correlate1d(arr, laplace_filter, ax, tmp, 1, 0.0, 0)
            output += tmp
    return output

if __name__ == '__main__':
    # Run in IPython: %timeit is an IPython magic, not plain Python.
    # ndimage.laplace is the same function as ndimage.filters.laplace.
    arr = np.random.random(10)
    test = (ndimage.laplace(arr, mode='wrap') == fastLaplaceNd(arr)).all()
    assert test
    print("laplace == Mine ?", test)
    print('testing timings...')
    print("array size 10")
    %timeit ndimage.laplace(arr, mode='wrap')
    %timeit fastLaplaceNd(arr)
    print('array size 100')
    arr = np.random.random(100)
    %timeit ndimage.laplace(arr, mode='wrap')
    %timeit fastLaplaceNd(arr)
    print("array size 100000")
    arr = np.random.random(100000)
    %timeit ndimage.laplace(arr, mode='wrap')
    %timeit fastLaplaceNd(arr)
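
Outside IPython, the %timeit lines above can be approximated with the standard timeit module. The sketch below uses only public APIs; laplace_roll and best_time_us are hypothetical helper names, and the pure-NumPy candidate is just a stand-in for whatever function is being compared, not the method from this answer. Timings depend on the machine, so none are quoted here.

```python
import timeit
import numpy as np
from scipy import ndimage

def scipy_laplace(a):
    # Public SciPy entry point with wrap-around boundaries
    return ndimage.laplace(a, mode='wrap')

def laplace_roll(a):
    # Hypothetical candidate: wrapped 1-D Laplacian via np.roll
    return np.roll(a, 1) - 2.0 * a + np.roll(a, -1)

def best_time_us(func, arr, number=1000, repeat=3):
    """Best per-call time in microseconds over `repeat` runs of `number` calls."""
    timer = timeit.Timer(lambda: func(arr))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6

rng = np.random.default_rng(0)
timings = {}
for size in (10, 100):
    arr = rng.random(size)
    # Check the candidate against SciPy before timing it
    assert np.allclose(scipy_laplace(arr), laplace_roll(arr))
    timings[size] = (best_time_us(scipy_laplace, arr),
                     best_time_us(laplace_roll, arr))
    print("array size %d: scipy %.2f us, candidate %.2f us"
          % ((size,) + timings[size]))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports in the output above.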

A discrete Laplacian faster than scipy.ndimage.filters.laplace for small arrays
