I spend most of my computation time in scipy.ndimage.filters.laplace().
The main advantage of scipy and numpy is vectorized computation implemented in C/C++ and wrapped in Python.
scipy.ndimage.filters.laplace() is built on _nd_image.correlate1d, which is part of the optimized library nd_image.h.
Is there a faster way to do this for arrays of size 10-100?
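Definition of the Laplace filter (ignoring the division):
    a[i-1] - 2*a[i] + a[i+1]
optionally wrapping around at the boundary, e.g. for the last element:
    a[n-2] - 2*a[n-1] + a[0]
where n = a.shape[0].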
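For reference, the wrapped stencil above can be written directly in NumPy. This is a minimal sketch and not a faster replacement: the helper name `laplace_wrap_1d` is hypothetical, and it assumes a 1-D array with periodic boundaries.

```python
import numpy as np

def laplace_wrap_1d(a):
    """Periodic 1-D Laplacian a[i-1] - 2*a[i] + a[i+1] (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float64)
    # np.roll(a, 1)[i] is a[i-1] and np.roll(a, -1)[i] is a[i+1]; both wrap around.
    return np.roll(a, 1) - 2.0 * a + np.roll(a, -1)

a = np.array([1.0, 4.0, 9.0, 16.0])
result = laplace_wrap_1d(a)
print(result)  # values: 18, 2, 2, -22
```

The last entry is a[n-2] - 2*a[n-1] + a[0] = 9 - 32 + 1 = -22, which matches the boundary rule in the definition.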
Answer 0 (score: 2)
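The problem is rooted in scipy's excellent error handling and debugging support. When the user already knows what they are doing, however, it only adds overhead.
The code below strips away the Python-level clutter in scipy's back end and calls the underlying compiled function directly, for a speedup of about 6x on small arrays!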
laplace == Mine ? True
testing timings...
array size 10
100000 loops, best of 3: 12.7 µs per loop
100000 loops, best of 3: 2.3 µs per loop
array size 100
100000 loops, best of 3: 12.7 µs per loop
100000 loops, best of 3: 2.5 µs per loop
array size 100000
1000 loops, best of 3: 413 µs per loop
1000 loops, best of 3: 404 µs per loop
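from scipy import ndimage
from scipy.ndimage import _nd_image  # private module; may change between SciPy versions
import numpy as np

laplace_filter = np.asarray([1, -2, 1], dtype=np.float64)

def fastLaplaceNd(arr):
    # Call the compiled routine directly, skipping scipy's Python-level checks.
    # Positional arguments: input, weights, axis, output, mode (1 = 'wrap'), cval, origin.
    output = np.zeros(arr.shape, 'float64')
    if arr.ndim > 0:
        _nd_image.correlate1d(arr, laplace_filter, 0, output, 1, 0.0, 0)
        if arr.ndim == 1:
            return output
        # correlate1d writes into its output argument and returns None,
        # so the remaining axes are accumulated through a scratch buffer.
        tmp = np.empty_like(output)
        for ax in range(1, arr.ndim):
            _nd_image.correlate1d(arr, laplace_filter, ax, tmp, 1, 0.0, 0)
            output += tmp
    return output

if __name__ == '__main__':
    # Run in IPython: %timeit is an IPython magic, not plain Python.
    # ndimage.laplace is the same function as ndimage.filters.laplace;
    # the filters namespace is deprecated in recent SciPy releases.
    arr = np.random.random(10)
    test = (ndimage.laplace(arr, mode='wrap') == fastLaplaceNd(arr)).all()
    assert test
    print("laplace == Mine ?", test)
    print('testing timings...')
    print("array size 10")
    %timeit ndimage.laplace(arr, mode='wrap')
    %timeit fastLaplaceNd(arr)
    print('array size 100')
    arr = np.random.random(100)
    %timeit ndimage.laplace(arr, mode='wrap')
    %timeit fastLaplaceNd(arr)
    print("array size 100000")
    arr = np.random.random(100000)
    %timeit ndimage.laplace(arr, mode='wrap')
    %timeit fastLaplaceNd(arr)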
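Outside IPython, where `%timeit` is unavailable, a comparable measurement can be taken with the standard `timeit` module. The sketch below is only an illustration: `best_time_us` is a hypothetical helper, and the timed function is a plain NumPy stand-in, not either of the functions above. Swap in the real callables to reproduce the comparison.

```python
import timeit
import numpy as np

def best_time_us(func, arr, number=200, repeat=3):
    """Best per-call time in microseconds over `repeat` runs (hypothetical helper)."""
    timings = timeit.repeat(lambda: func(arr), number=number, repeat=repeat)
    return min(timings) / number * 1e6

def stand_in(a):
    # Placeholder workload: periodic Laplacian via np.roll.
    return np.roll(a, 1) - 2.0 * a + np.roll(a, -1)

for size in (10, 100, 100000):
    arr = np.random.random(size)
    print("array size", size, "-> %.1f us per call" % best_time_us(stand_in, arr))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that `%timeit` reports, which makes the numbers roughly comparable to the output shown earlier.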