Question

I have two arrays of points, x and y, with shapes (n, p) and (m, p) respectively. As an example:

x = np.array([[ 0.     , -0.16341,  0.98656],
              [-0.05937, -0.25205,  0.96589],
              [ 0.05937, -0.25205,  0.96589],
              [-0.11608, -0.33488,  0.93508],
              [ 0.     , -0.33416,  0.94252]])
y = np.array([[ 0.     , -0.36836,  0.92968],
              [-0.12103, -0.54558,  0.82928],
              [ 0.12103, -0.54558,  0.82928]])

I want to compute an (n, m) matrix containing the angle between every pair of points, as in this question. That is, a vectorized version of:

theta = np.array(
            [ np.arccos(np.dot(i, j) / (la.norm(i) * la.norm(j)))
                 for i in x for j in y ]
        ).reshape((n, m))

Note: n and m can each be about 10000.

Answer 1

There are several ways to do this:

import numpy as np
import numpy.linalg as la
from scipy.spatial import distance as dist

# Manually
def method0(x, y):
    dotprod_mat = np.dot(x,  y.T)
    costheta = dotprod_mat / la.norm(x, axis=1)[:, np.newaxis]
    costheta /= la.norm(y, axis=1)
    return np.arccos(costheta)

# Using einsum
def method1(x, y):
    dotprod_mat = np.einsum('ij,kj->ik', x, y)
    costheta = dotprod_mat / la.norm(x, axis=1)[:, np.newaxis]
    costheta /= la.norm(y, axis=1)
    return np.arccos(costheta)

# Using scipy.spatial.cdist (one-liner)
def method2(x, y):
    costheta = 1 - dist.cdist(x, y, 'cosine')
    return np.arccos(costheta)

# Realize that your arrays `x` and `y` are already normalized, meaning you can
# optimize method1 even more
def method3(x, y):
    costheta = np.einsum('ij,kj->ik', x, y) # Directly gives costheta, since
                                            # ||x|| = ||y|| = 1
    return np.arccos(costheta)
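
As a quick sanity check (not part of the original answer), the vectorized result can be compared against the explicit loop from the question. The sketch below uses a subset of the sample points and only NumPy; the names `reference` and `agree` are made up for illustration.

```python
import numpy as np

# A few of the sample points from the question.
x = np.array([[ 0.     , -0.16341,  0.98656],
              [-0.05937, -0.25205,  0.96589],
              [ 0.05937, -0.25205,  0.96589]])
y = np.array([[ 0.     , -0.36836,  0.92968],
              [-0.12103, -0.54558,  0.82928]])

# Reference: the explicit double loop from the question.
reference = np.array(
    [np.arccos(np.dot(i, j) / (np.linalg.norm(i) * np.linalg.norm(j)))
     for i in x for j in y]
).reshape((len(x), len(y)))

# Vectorized: all dot products at once, then divide by the row norms
# of x (broadcast down the columns) and of y (broadcast along the rows).
costheta = np.einsum('ij,kj->ik', x, y)
costheta /= np.linalg.norm(x, axis=1)[:, np.newaxis]
costheta /= np.linalg.norm(y, axis=1)
theta = np.arccos(costheta)

agree = np.allclose(theta, reference)
print(agree)
```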

Timing results for (n, m) = (1212, 252):

>>> %timeit theta = method0(x, y)
100 loops, best of 3: 11.1 ms per loop
>>> %timeit theta = method1(x, y)
100 loops, best of 3: 10.8 ms per loop
>>> %timeit theta = method2(x, y)
100 loops, best of 3: 12.3 ms per loop
>>> %timeit theta = method3(x, y)
100 loops, best of 3: 9.42 ms per loop

The relative timing differences shrink as the number of elements increases. For (n, m) = (6252, 1212):

>>> %timeit -n10 theta = method0(x, y)
10 loops, best of 3: 365 ms per loop
>>> %timeit -n10 theta = method1(x, y)
10 loops, best of 3: 358 ms per loop
>>> %timeit -n10 theta = method2(x, y)
10 loops, best of 3: 384 ms per loop
>>> %timeit -n10 theta = method3(x, y)
10 loops, best of 3: 314 ms per loop

However, if you leave out the np.arccos step, i.e. suppose you can manage with just costheta and do not need theta itself, then:

>>> %timeit costheta = np.einsum('ij,kj->ik', x, y)
10 loops, best of 3: 61.3 ms per loop
>>> %timeit costheta = 1 - dist.cdist(x, y, 'cosine')
10 loops, best of 3: 124 ms per loop
>>> %timeit costheta = dist.cdist(x, y, 'cosine')
10 loops, best of 3: 112 ms per loop

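This is for the (6252, 1212) case. So np.arccos actually takes up about 80% of the time. Here I find np.einsum to be about twice as fast as dist.cdist (61.3 ms versus 124 ms), so you definitely want to use einsum.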

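Summary: the timings for theta are broadly similar across methods, but np.einsum is fastest for me, especially when you are not computing the norms unnecessarily. Try to avoid computing theta and work with costheta alone.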
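
To illustrate working with costheta alone, here is a minimal sketch that selects all pairs closer than some angle without ever calling np.arccos. The helper name `pairs_within_angle` and its `max_angle_rad` parameter are hypothetical, not from the original answer; it assumes `max_angle_rad` lies in [0, pi].

```python
import numpy as np

def pairs_within_angle(x, y, max_angle_rad):
    """Return a boolean (n, m) mask of pairs whose angle is below max_angle_rad."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    costheta = np.einsum('ij,kj->ik', x, y)
    costheta /= np.linalg.norm(x, axis=1)[:, np.newaxis]
    costheta /= np.linalg.norm(y, axis=1)
    # arccos is monotonically decreasing on [-1, 1], so
    # theta < max_angle  is equivalent to  costheta > cos(max_angle).
    return costheta > np.cos(max_angle_rad)

x = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0]])
mask = pairs_within_angle(x, y, np.deg2rad(60.0))
print(mask)
```

The comparison happens in cosine space, so the expensive np.arccos call over the full (n, m) matrix is skipped entirely.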

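Note: one important point I did not mention is that finite floating-point precision can cause np.arccos to return nan values. method0 through method2 worked for values of x and y that had not been properly normalized, as you would expect, but method3 gave a few nans. I fixed this by pre-normalizing, which naturally destroys any gain from using method3, unless you need to repeat the computation many times on a small set of pre-normalized matrices (for whatever reason).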
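
A common complementary safeguard, not mentioned in the original answer, is to clip costheta into [-1, 1] before calling np.arccos, so that round-off such as 1.0000000000000002 cannot produce nan. The sketch below is an assumption-laden illustration; `angles_clipped` is a hypothetical name. Clipping only absorbs round-off: it does not replace normalization if the inputs are not unit vectors.

```python
import numpy as np

def angles_clipped(x, y):
    """Pairwise angles in radians, with costheta clipped to arccos's domain."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    costheta = np.einsum('ij,kj->ik', x, y)
    costheta /= np.linalg.norm(x, axis=1)[:, np.newaxis]
    costheta /= np.linalg.norm(y, axis=1)
    # Round-off can push values slightly outside [-1, 1]; clip in place.
    np.clip(costheta, -1.0, 1.0, out=costheta)
    return np.arccos(costheta)

# Angles of a few points with themselves: the diagonal should be ~0, never nan.
v = np.array([[0.1, 0.2, 0.3],
              [-0.12103, -0.54558, 0.82928],
              [3.0, 4.0, 12.0]])
theta = angles_clipped(v, v)
print(np.isnan(theta).any())
```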

Compute a matrix of pairwise angles between two arrays of points
