Python NumPy matrix multiplication in higher dimensions

Time: 2014-05-10 04:03:18

Tags: python numpy matrix machine-learning linear-algebra

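I am looking for a matrix operation in numpy that would speed up the following computation.

I have two 3D arrays, A and B. The first dimension indexes the examples, and both arrays contain n_examples examples. What I want is to take the dot product of A[i] and B[i] for each example i and sum the results: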

import numpy as np

n_examples = 10
A = np.random.randn(n_examples, 20,30)
B = np.random.randn(n_examples, 30,5)
sum = np.zeros([20,5])
for i in range(len(A)):
  sum += np.dot(A[i],B[i])

2 Answers:

Answer 0 (score: 4)

This is a typical application of np.tensordot():

sum = np.tensordot(A, B, [[0,2],[0,1]])
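
To make the axes argument concrete, here is a minimal sketch that checks the tensordot() call against the loop from the question. The fixed seed and the names total and result are assumptions added for illustration; they are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
n_examples = 10
A = rng.standard_normal((n_examples, 20, 30))
B = rng.standard_normal((n_examples, 30, 5))

# Reference result: the explicit loop from the question.
total = np.zeros((20, 5))
for i in range(n_examples):
    total += np.dot(A[i], B[i])

# Contract axis 0 of A with axis 0 of B (the examples) and
# axis 2 of A with axis 1 of B (the inner dimension of each product).
result = np.tensordot(A, B, axes=[[0, 2], [0, 1]])

print(result.shape)                # (20, 5)
print(np.allclose(result, total))  # True
```

The two axes that are not contracted, axis 1 of A and axis 2 of B, become the rows and columns of the output.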

Timing

Using the following code:

import numpy as np

n_examples = 100
A = np.random.randn(n_examples, 20,30)
B = np.random.randn(n_examples, 30,5)

def sol1():
    sum = np.zeros([20,5])
    for i in range(len(A)):
      sum += np.dot(A[i],B[i])
    return sum

def sol2():
    # list() is needed on Python 3, where map() returns an iterator
    return np.array(list(map(np.dot, A,B))).sum(0)

def sol3():
    return np.einsum('nmk,nkj->mj',A,B)

def sol4():
    return np.tensordot(A, B, [[2,0],[1,0]])

def sol5():
    return np.tensordot(A, B, [[0,2],[0,1]])

Results:

timeit sol1()
1000 loops, best of 3: 1.46 ms per loop

timeit sol2()
100 loops, best of 3: 4.22 ms per loop

timeit sol3()
1000 loops, best of 3: 1.87 ms per loop

timeit sol4()
10000 loops, best of 3: 205 µs per loop

timeit sol5()
10000 loops, best of 3: 172 µs per loop

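On my machine tensordot() is the fastest solution, and changing the order in which the axes are listed changes neither the result nor, to any significant degree, the performance (205 µs vs. 172 µs).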
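
The following sketch checks that the two axis orderings used in sol4 and sol5 give the same array. It also shows that the whole sum can be written as a single 2D matrix product, which is one way to see why a single contraction can beat a Python-level loop. The reshape-based version is an illustration added here, not a statement about how tensordot() is implemented; the names r4, r5, A2, B2 and r_flat are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
n_examples = 100
A = rng.standard_normal((n_examples, 20, 30))
B = rng.standard_normal((n_examples, 30, 5))

# Same contraction, axes listed in a different order (sol4 and sol5).
r4 = np.tensordot(A, B, axes=[[2, 0], [1, 0]])
r5 = np.tensordot(A, B, axes=[[0, 2], [0, 1]])

# The same sum as one (20, n*30) @ (n*30, 5) matrix product:
# move the example axis next to the inner axis, then flatten both.
A2 = A.transpose(1, 0, 2).reshape(20, n_examples * 30)
B2 = B.reshape(n_examples * 30, 5)
r_flat = A2 @ B2

print(np.allclose(r4, r5))      # True
print(np.allclose(r5, r_flat))  # True
```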

Answer 1 (score: 2)

Ha, it can be done in just one line: np.einsum('nmk,nkj->mj',A,B)

See Einstein summation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html
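
As a rough guide to reading the subscript string: every index that appears in the inputs but not after the arrow is summed over, so 'nmk,nkj->mj' sums over both n (the examples) and k (the inner dimension). The small shapes and the names out and per_example below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
A = rng.standard_normal((4, 2, 3))  # indices n, m, k
B = rng.standard_normal((4, 3, 5))  # indices n, k, j

# n and k are missing from the output, so both are summed over.
out = np.einsum('nmk,nkj->mj', A, B)

# Keeping n in the output gives one matrix product per example instead.
per_example = np.einsum('nmk,nkj->nmj', A, B)

print(out.shape)                                   # (2, 5)
print(np.allclose(per_example, np.matmul(A, B)))   # True
print(np.allclose(per_example.sum(axis=0), out))   # True
```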

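It is not the same question, but the idea is much the same; see the discussion and the alternative approaches in this thread that we just discussed: numpy multiply matrices preserve third axis

Don't name your variable sum; doing so overrides the built-in sum.

As @Jaime pointed out, for arrays of these sizes the loop is actually faster. In fact, the solution based on map and sum, although simpler, is even slower: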
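
A small sketch of what goes wrong, assuming two hypothetical helper functions (shadowed and not_shadowed) written only for this illustration:

```python
import numpy as np

def shadowed():
    sum = np.zeros((2, 2))  # this local name hides the built-in sum
    try:
        return sum([1, 2, 3])  # tries to call the array, not the built-in
    except TypeError as exc:
        return "TypeError: " + str(exc)

def not_shadowed():
    total = np.zeros((2, 2))  # a different name leaves the built-in usable
    return sum([1, 2, 3]), total.shape

message = shadowed()
value, shape = not_shadowed()
print(message)
print(value)  # 6
```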

In [19]:

%%timeit
SUM = np.zeros([20,5])
for i in range(len(A)):
  SUM += np.dot(A[i],B[i])
10000 loops, best of 3: 115 µs per loop
In [20]:

%timeit np.array(map(np.dot, A,B)).sum(0)
1000 loops, best of 3: 445 µs per loop
In [21]:

%timeit np.einsum('nmk,nkj->mj',A,B)
1000 loops, best of 3: 259 µs per loop

Things are different for larger sizes:

n_examples = 1000
A = np.random.randn(n_examples, 20,1000)
B = np.random.randn(n_examples, 1000,5)

In [46]:

%%timeit
SUM = np.zeros([20,5])
for i in range(len(A)):
  SUM += np.dot(A[i],B[i])
1 loops, best of 3: 191 ms per loop
In [47]:

%timeit np.array(map(np.dot, A,B)).sum(0)
1 loops, best of 3: 164 ms per loop
In [48]:

%timeit np.einsum('nmk,nkj->mj',A,B)
1 loops, best of 3: 451 ms per loop
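
The rankings above changed with the array sizes, and they are also likely to depend on the machine, the NumPy version and the underlying BLAS library, so it is worth re-measuring locally. Below is a minimal sketch using the standard timeit module. The function names, the fixed seed, and the number and repeat values are assumptions, and the small shapes are used only to keep the run short; substitute your own.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
n_examples = 100
A = rng.standard_normal((n_examples, 20, 30))
B = rng.standard_normal((n_examples, 30, 5))

def loop():
    total = np.zeros((20, 5))
    for i in range(len(A)):
        total += np.dot(A[i], B[i])
    return total

def map_sum():
    # list() is needed on Python 3, where map() returns an iterator
    return np.array(list(map(np.dot, A, B))).sum(0)

def einsum():
    return np.einsum('nmk,nkj->mj', A, B)

def tensordot():
    return np.tensordot(A, B, axes=[[0, 2], [0, 1]])

reference = loop()
timings = {}
for func in (loop, map_sum, einsum, tensordot):
    # Check correctness before timing anything.
    assert np.allclose(func(), reference)
    # Best of 3 repeats, 100 calls each, reported per call.
    best = min(timeit.repeat(func, number=100, repeat=3)) / 100
    timings[func.__name__] = best
    print(f"{func.__name__:>10}: {best * 1e6:.1f} µs per call")
```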