Improving the performance of array operations

Asked: 2013-10-12 02:02:10

Tags: python arrays performance numpy

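I have a large piece of code that takes a while to run. I've traced most of the time to two lines, and I'd like to know whether there is a way to speed them up. Here's an MWE: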

import numpy as np

def setup(k=2, m=100, n=300):
    return np.random.randn(k, m), np.random.randn(k, n), np.random.randn(k, m)
# make some random points and weights
a, b, w = setup()

# Weighted difference between every point in a and every point in b.
# Resulting shape is (n, k, m), i.e. (300, 2, 100) with the defaults.
wdiff = (a[np.newaxis,...] - b[np.newaxis,...].T) / w[np.newaxis,...]

# This is the set of operations that need a performance boost:
dist_1 = np.exp(-0.5*(wdiff*wdiff)) / w
dist_2 = np.array([i[0]*i[1] for i in dist_1])

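BTW, I'm coming from this question: Fast weighted euclidean distance between points in arrays, where ali_m's amazing answer saved me a lot of time by applying broadcasting (which I knew nothing about until then). Can something similar be applied to these lines?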

1 answer:

Answer 0 (score: 3)

Your dist_2 calculation can be made roughly 10x faster by replacing the Python-level loop with a product along axis 1:

>>> dist_1.shape
(300, 2, 100)
>>> %timeit dist_2 = np.array([i[0]*i[1] for i in dist_1])
1000 loops, best of 3: 1.35 ms per loop
>>> %timeit dist_2 = dist_1.prod(axis=1)
10000 loops, best of 3: 116 µs per loop
>>> np.allclose(np.array([i[0]*i[1] for i in dist_1]), dist_1.prod(axis=1))
True

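I couldn't do much for your dist_1, since most of the time is spent in the exponential. Compare the same expression without and with np.exp: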

>>> %timeit (-0.5*(wdiff*wdiff)) / w
1000 loops, best of 3: 467 µs per loop
>>> %timeit np.exp((-0.5*(wdiff*wdiff)))/w
100 loops, best of 3: 3.3 ms per loop