Replacing a sequential product-and-sum with a faster matrix operation in 3D

Time: 2017-05-05 08:29:45

Tags: numpy theano matrix-multiplication numpy-broadcasting numpy-einsum

In my current theano script, the bottleneck is the following code:

import time

import numpy as np

axis = 0
prob = np.random.random( ( 1, 1000, 50 ) )
cases = np.random.random( ( 1000, 1000, 50 ) )

start = time.time(  )
for i in range( 1000 ):
    # prob broadcasts over the first axis of cases; the sum runs over axis 1
    result = ( cases * prob ).sum( axis=1-axis, keepdims=True )
print( '3D naive method took {} seconds'.format( time.time() - start ) )
print( result.shape )
print()

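In the 2D case, I saw that replacing the elementwise product + sum with a dot product gave me a 5x speedup. Is there a matrix operation that can help me in this 3D case as well?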
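For reference, the 2D equivalence mentioned above can be sketched as follows. The original 2D code is not shown, so the names `prob_2d` and `cases_2d` and their shapes are hypothetical; the point is only that an elementwise product followed by a sum over the shared axis is the same contraction as a matrix-vector product.

```python
import numpy as np

rng = np.random.default_rng(0)
prob_2d = rng.random(50)           # hypothetical weights, shape (50,)
cases_2d = rng.random((200, 50))   # hypothetical data, shape (200, 50)

# Elementwise product, then a sum over the shared axis
naive_2d = (cases_2d * prob_2d).sum(axis=1)

# The same contraction written as a matrix-vector product
dot_2d = cases_2d.dot(prob_2d)

assert np.allclose(naive_2d, dot_2d)
```

The dot version avoids materializing the full `cases_2d * prob_2d` temporary, which is one plausible reason for the speedup reported above.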

Edit

Divakar gave me an einsum-based version. However, my goal is to port this to theano, and theano does not support einsum, so alternatives that work in theano are welcome.

1 answer:

Answer 0 (score: 1)

We can use np.einsum -

result = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]
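To make the subscripts concrete, here is a small sketch that spells the contraction out as explicit loops on tiny stand-in arrays (the shapes are assumptions chosen for readability). It passes `prob[0]` with the subscripts `'jk,ijk->ik'`, a slight variation on the call above that does not need the length-1 leading axis of `prob` to broadcast.

```python
import numpy as np

rng = np.random.default_rng(0)
prob = rng.random((1, 4, 3))    # small stand-in for (1, 1000, 50)
cases = rng.random((5, 4, 3))   # small stand-in for (1000, 1000, 50)

# j appears in both inputs but not in the output, so it is summed over;
# i and k appear in the output, so they are kept.
expected = np.zeros((5, 3))
for i in range(5):
    for k in range(3):
        for j in range(4):
            expected[i, k] += prob[0, j, k] * cases[i, j, k]

via_einsum = np.einsum('jk,ijk->ik', prob[0], cases)
assert np.allclose(via_einsum, expected)

# Re-insert the summed axis to match keepdims=True in the original code
result = via_einsum[:, None, :]
assert np.allclose(result, (cases * prob).sum(axis=1, keepdims=True))
```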

Another option, with np.matmul -

result = np.matmul(prob.transpose(2,0,1), cases.T).T
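The transposes are easier to follow with the intermediate shapes written out. The sketch below uses small stand-in arrays (the sizes are assumptions) and checks the result against the elementwise product + sum.

```python
import numpy as np

rng = np.random.default_rng(0)
prob = rng.random((1, 4, 3))    # small stand-in for (1, 1000, 50)
cases = rng.random((5, 4, 3))   # small stand-in for (1000, 1000, 50)

lhs = prob.transpose(2, 0, 1)   # (3, 1, 4): lhs[k, 0, j] = prob[0, j, k]
rhs = cases.T                   # (3, 4, 5): rhs[k, j, i] = cases[i, j, k]

# matmul treats the leading axis k as a stack of matrices and computes
# one (1 x 4) @ (4 x 5) product per k
stacked = np.matmul(lhs, rhs)   # (3, 1, 5)

# Reversing the axes restores the (i, 1, k) layout of the original result
result = stacked.T              # (5, 1, 3)

reference = (cases * prob).sum(axis=1, keepdims=True)
assert np.allclose(result, reference)
```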

Runtime test -

In [70]: axis = 0
    ...: prob = np.random.random( ( 1, 1000, 50 ) )
    ...: cases = np.random.random( ( 1000, 1000, 50 ) )
    ...: 

In [71]: out1 = ( cases * prob ).sum( axis=1-axis, keepdims=True )

In [72]: out2 = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]

In [73]: out3 = np.matmul(prob.transpose(2,0,1), cases.T).T

In [74]: np.allclose(out1, out2)
Out[74]: True

In [75]: np.allclose(out1, out3)
Out[75]: True

In [76]: %timeit ( cases * prob ).sum( axis=1-axis, keepdims=True )
10 loops, best of 3: 101 ms per loop

In [77]: %timeit np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]
10 loops, best of 3: 44.1 ms per loop

In [78]: %timeit np.matmul(prob.transpose(2,0,1), cases.T).T
10 loops, best of 3: 44 ms per loop
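Since the question asks for something that can be ported to theano, here is a minimal NumPy sketch of the same computation written only as transposes plus one ordinary dot per index k, with no einsum. In Theano, the per-k loop would presumably map onto a batched dot such as `theano.tensor.batched_dot`; that mapping is an assumption and is not exercised here. The small shapes are stand-ins for the real ones.

```python
import numpy as np

rng = np.random.default_rng(0)
prob = rng.random((1, 40, 5))     # small stand-in for (1, 1000, 50)
cases = rng.random((30, 40, 5))   # small stand-in for (1000, 1000, 50)

reference = (cases * prob).sum(axis=1, keepdims=True)   # (30, 1, 5)

# Move the last axis k to the front so that it acts as a batch axis
cases_b = cases.transpose(2, 0, 1)   # (5, 30, 40): cases_b[k, i, j] = cases[i, j, k]
prob_b = prob[0].T                   # (5, 40):     prob_b[k, j]    = prob[0, j, k]

# One plain matrix-vector dot per batch entry k
batched = np.stack([cases_b[k].dot(prob_b[k])
                    for k in range(cases_b.shape[0])])   # (5, 30)

# Restore the (i, 1, k) layout of the original result
result = batched.T[:, None, :]                           # (30, 1, 5)
assert np.allclose(result, reference)
```

The Python loop here is only for illustration and is not expected to be fast in NumPy; the point is that each step is a transpose or a standard dot product.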