Suppose I have two arrays, a and b, with
a.shape = (5,2,3)
b.shape = (2,3)
Then c = a * b gives me an array c of shape (5,2,3), where c[i,j,k] = a[i,j,k]*b[j,k].
Now the situation is
a.shape = (5,2,3)
b.shape = (2,3,8)
and I want c to have shape (5,2,3,8), where c[i,j,k,l] = a[i,j,k]*b[j,k,l].
How can I do this efficiently? My a and b are actually very large.
Answer 0 (score: 13)
This should work:
a[..., numpy.newaxis] * b[numpy.newaxis, ...]
Usage:
In : a = numpy.random.randn(5,2,3)
In : b = numpy.random.randn(2,3,8)
In : c = a[..., numpy.newaxis]*b[numpy.newaxis, ...]
In : c.shape
Out: (5, 2, 3, 8)
Reference: Array Broadcasting in numpy
Edit: updated the reference URL
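To make the broadcasting step concrete, here is a minimal sketch that checks the expression against an explicit loop. The seeded generator and the names `c_short` and `expected` are illustrative and not part of the original answer. It also shows that the `numpy.newaxis` on `b` appears to be optional, since broadcasting aligns shapes from the right and pads missing leading axes.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
a = rng.standard_normal((5, 2, 3))
b = rng.standard_normal((2, 3, 8))

# a[..., np.newaxis] has shape (5, 2, 3, 1); b[np.newaxis, ...] has shape (1, 2, 3, 8).
# Axes of length 1 are stretched, so the product has shape (5, 2, 3, 8).
c = a[..., np.newaxis] * b[np.newaxis, ...]

# Shapes are aligned from the right, so b of shape (2, 3, 8) is treated
# as (1, 2, 3, 8) automatically and the newaxis on b can be dropped.
c_short = a[..., np.newaxis] * b

# Reference result built with explicit loops over i, j, k.
expected = np.empty((5, 2, 3, 8))
for i in range(5):
    for j in range(2):
        for k in range(3):
            expected[i, j, k, :] = a[i, j, k] * b[j, k, :]
```

Both `c` and `c_short` should match `expected`; `np.allclose(c, expected)` is one way to confirm it.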
Answer 1 (score: 7)
I think the following should work:
import numpy as np
a = np.random.normal(size=(5,2,3))
b = np.random.normal(size=(2,3,8))
c = np.einsum('ijk,jkl->ijkl',a,b)
which gives:
In [5]: c.shape
Out[5]: (5, 2, 3, 8)
In [6]: a[0,0,1]*b[0,1,2]
Out[6]: -0.041308376453821738
In [7]: c[0,0,1,2]
Out[7]: -0.041308376453821738
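np.einsum can be a little tricky to use, but it is very powerful for indexing problems like this one:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html
Also note that this requires numpy >= v1.6.0.
I'm not sure how efficient it is for your particular problem, but if it doesn't perform as well as you need, definitely consider Cython with explicit for loops, possibly parallelized with prange.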
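As a rough illustration of how the subscript string reads, the sketch below compares einsum with the broadcasting expression. The variable names and the seeded generator are illustrative. Every index on the left of `->` also appears on the right in `'ijk,jkl->ijkl'`, so nothing is summed; dropping an index from the output sums over it.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
a = rng.standard_normal((5, 2, 3))
b = rng.standard_normal((2, 3, 8))

# All of i, j, k, l are kept in the output, so this is a pure
# elementwise product: c[i,j,k,l] = a[i,j,k] * b[j,k,l].
c_einsum = np.einsum('ijk,jkl->ijkl', a, b)
c_broadcast = a[..., np.newaxis] * b
same = np.allclose(c_einsum, c_broadcast)

# Leaving k out of the output sums over k instead.
c_sum_k = np.einsum('ijk,jkl->ijl', a, b)
same_sum = np.allclose(c_sum_k, c_broadcast.sum(axis=2))
```

Here both `same` and `same_sum` should come out true, which is a quick way to check that a subscript string means what you intended.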
Update:
In [18]: %timeit np.einsum('ijk,jkl->ijkl',a,b)
100000 loops, best of 3: 4.78 us per loop
In [19]: %timeit a[..., np.newaxis]*b[np.newaxis, ...]
100000 loops, best of 3: 12.2 us per loop
In [20]: a = np.random.normal(size=(50,20,30))
In [21]: b = np.random.normal(size=(20,30,80))
In [22]: %timeit np.einsum('ijk,jkl->ijkl',a,b)
100 loops, best of 3: 16.6 ms per loop
In [23]: %timeit a[..., np.newaxis]*b[np.newaxis, ...]
100 loops, best of 3: 16.6 ms per loop
In [2]: a = np.random.normal(size=(500,20,30))
In [3]: b = np.random.normal(size=(20,30,800))
In [4]: %timeit np.einsum('ijk,jkl->ijkl',a,b)
1 loops, best of 3: 3.31 s per loop
In [5]: %timeit a[..., np.newaxis]*b[np.newaxis, ...]
1 loops, best of 3: 2.6 s per loop
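For the largest case above, the result has 500*20*30*800 = 240,000,000 elements, which is about 1.92 GB as float64, so memory traffic may matter as much as the choice between the two expressions. The following is only a sketch under that assumption: `result_nbytes` and `multiply_into` are hypothetical helpers, not NumPy functions. They estimate the output size up front and fill a preallocated buffer block by block with `np.multiply(..., out=...)`.

```python
import numpy as np

def result_nbytes(a_shape, b_shape, dtype=np.float64):
    """Bytes needed for c of shape a_shape + (b_shape[-1],)."""
    n_elements = int(np.prod(a_shape, dtype=np.int64)) * int(b_shape[-1])
    return n_elements * np.dtype(dtype).itemsize

def multiply_into(a, b, out=None, chunk=64):
    """Hypothetical helper: out[i,j,k,l] = a[i,j,k] * b[j,k,l], in blocks along axis 0."""
    if out is None:
        out = np.empty(a.shape + (b.shape[-1],), dtype=np.result_type(a, b))
    for start in range(0, a.shape[0], chunk):
        stop = start + chunk
        # out[start:stop] is a view, so the product is written in place.
        np.multiply(a[start:stop, ..., np.newaxis], b, out=out[start:stop])
    return out

# Small usage example with the shapes from the question.
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 2, 3))
b = rng.standard_normal((2, 3, 8))
c = multiply_into(a, b, chunk=2)
```

This does not make the result any smaller. It only lets you check the size before allocating and reuse an existing buffer, for example a memory-mapped array, as `out`.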