Euclidean distance between a matrix and a vector

Time: 2017-04-12 16:19:39

Tags: python numpy

I want to compute the Euclidean distance from a vector to each column of a matrix. Is this correct?

distances=np.sqrt(np.sum(np.square(new_v-val.reshape(10,1)),axis=0))

new_v is a matrix and val.reshape(10,1) is a column vector. Is there another, or better, way to do this?

2 answers:

Answer 0: (score: 2)

What you have is correct. There is a simpler way in numpy.linalg:

from numpy.linalg import norm
norm(new_v.T-val, axis=1, ord=2)
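As a quick sanity check, the two expressions can be compared on a small input. This is only a sketch: the array sizes, the seed, and the names d_question and d_norm are chosen for illustration and do not come from the question.

```python
import numpy as np
from numpy.linalg import norm

# Illustrative inputs: 5 column vectors of length 10 and one reference vector.
rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
new_v = rng.integers(0, 9, size=(10, 5)).astype(float)
val = rng.integers(0, 9, size=10).astype(float)

# Expression from the question: val is broadcast as a (10, 1) column.
d_question = np.sqrt(np.sum(np.square(new_v - val.reshape(10, 1)), axis=0))

# norm-based version: new_v.T has shape (5, 10), so val broadcasts over rows.
d_norm = norm(new_v.T - val, axis=1, ord=2)

# Both give one distance per column of new_v.
assert d_question.shape == (5,)
assert np.allclose(d_question, d_norm)
```

Transposing new_v lets val broadcast without a reshape, which is why the norm call needs axis=1 rather than axis=0.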

Answer 1: (score: 0)

You could use the efficient np.einsum -

subs = new_v - val[:,None]
out = np.sqrt(np.einsum('ij,ij->j',subs,subs))
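To make the subscript string concrete, here is a tiny worked case. The 2x3 array and the names col_sq_norms and reference are made up for the illustration; 'ij,ij->j' multiplies the two operands elementwise and sums over i (the rows), leaving one value per column j.

```python
import numpy as np

# Illustrative input: 3 column vectors of length 2.
subs = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Elementwise product summed over rows i, one result per column j.
col_sq_norms = np.einsum('ij,ij->j', subs, subs)

# The same quantity written with an explicit temporary array of squares.
reference = (subs * subs).sum(axis=0)

assert np.allclose(col_sq_norms, reference)
# Column sums of squares: 1+16, 4+25, 9+36.
assert np.allclose(col_sq_norms, [17.0, 29.0, 45.0])
```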

Alternatively, using the (a-b)^2 = a^2 + b^2 - 2ab formula -

out = np.sqrt(np.einsum('ij,ij->j',new_v, new_v) + val.dot(val) - 2*val.dot(new_v))
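The expansion can be checked numerically against the direct difference. This is a sketch on made-up random inputs. The np.maximum clip before the square root is my own precaution, not part of the answer: with floating-point data the expanded form can come out very slightly negative when a column is (nearly) equal to val, and sqrt would then return nan.

```python
import numpy as np

# Illustrative float inputs: 6 column vectors of length 10.
rng = np.random.default_rng(1)  # arbitrary seed, only for repeatability
new_v = rng.random((10, 6))
val = rng.random(10)

# Direct squared distances: sum over rows of (a - b)^2.
direct_sq = ((new_v - val[:, None]) ** 2).sum(axis=0)

# Expanded form: sum(a^2) + sum(b^2) - 2 * sum(a*b), one value per column.
expanded_sq = (np.einsum('ij,ij->j', new_v, new_v)
               + val.dot(val)
               - 2 * val.dot(new_v))

assert np.allclose(direct_sq, expanded_sq)

# Assumption (not in the answer): clip tiny negative round-off before sqrt.
out = np.sqrt(np.maximum(expanded_sq, 0.0))
assert np.allclose(out, np.sqrt(direct_sq))
```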

If the second axis of new_v is large, we could also use the numexpr module to compute the sqrt part at the end.

Runtime test

Approaches -

import numpy as np
import numexpr as ne

def einsum_based(new_v, val):
    subs = new_v - val[:,None]
    return np.sqrt(np.einsum('ij,ij->j',subs,subs))

def dot_based(new_v, val):
    return np.sqrt(np.einsum('ij,ij->j',new_v, new_v) + \
                            val.dot(val) - 2*val.dot(new_v))

def einsum_numexpr_based(new_v, val):
    subs = new_v - val[:,None]
    sq_dists = np.einsum('ij,ij->j',subs,subs)
    return ne.evaluate('sqrt(sq_dists)')

def dot_numexpr_based(new_v, val):
    sq_dists = np.einsum('ij,ij->j',new_v, new_v) + val.dot(val) - 2*val.dot(new_v)
    return ne.evaluate('sqrt(sq_dists)')

Timings -

In [85]: # Inputs
    ...: new_v = np.random.randint(0,9,(10,100000))
    ...: val = np.random.randint(0,9,(10))


In [86]: %timeit np.sqrt(np.sum(np.square(new_v-val.reshape(10,1)),axis=0))
    ...: %timeit einsum_based(new_v, val)
    ...: %timeit dot_based(new_v, val)
    ...: %timeit einsum_numexpr_based(new_v, val)
    ...: %timeit dot_numexpr_based(new_v, val)
    ...: 
100 loops, best of 3: 2.91 ms per loop
100 loops, best of 3: 2.1 ms per loop
100 loops, best of 3: 2.12 ms per loop
100 loops, best of 3: 2.26 ms per loop
100 loops, best of 3: 2.43 ms per loop

In [87]: from numpy.linalg import norm

# @wim's solution
In [88]: %timeit norm(new_v.T-val, axis=1, ord=2)
100 loops, best of 3: 5.88 ms per loop
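The %timeit magic only works inside IPython. To repeat a comparison like this from a plain script, something along these lines should work with the standard timeit module. It is a rough sketch: question_based and best_ms are names invented here, only two of the approaches are included, and the numbers will differ from the ones above depending on machine and NumPy version.

```python
import timeit
import numpy as np

# Same input shapes as in the timings above.
new_v = np.random.randint(0, 9, (10, 100000))
val = np.random.randint(0, 9, (10))

def question_based(new_v, val):
    # Expression from the question, wrapped in a function (hypothetical name).
    return np.sqrt(np.sum(np.square(new_v - val.reshape(10, 1)), axis=0))

def einsum_based(new_v, val):
    subs = new_v - val[:, None]
    return np.sqrt(np.einsum('ij,ij->j', subs, subs))

def best_ms(func, number=10, repeat=3):
    # Best of `repeat` runs, reported as milliseconds per call.
    times = timeit.repeat(lambda: func(new_v, val), number=number, repeat=repeat)
    return min(times) / number * 1e3

results = {f.__name__: best_ms(f) for f in (question_based, einsum_based)}
for name, ms in results.items():
    print(f"{name}: {ms:.3f} ms per loop")
```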