Question

I am looking for an efficient way to multiply a list of matrices in Numpy. I have an array like this:

import numpy as np
a = np.random.randn(1000, 4, 4)

I want to matrix-multiply along the long axis, so that the result is a single 4x4 matrix. Obviously I can do:

res = np.identity(4)
for ai in a:
    res = np.matmul(res, ai)

But this is too slow. Is there a faster way (perhaps using einsum or some other function that I don't fully understand yet)?

Answer 1

For stacks whose size is a power of 2, a solution that needs only log_2(n) for-loop iterations could be

while len(a) > 1:
    a = np.matmul(a[::2, ...], a[1::2, ...])

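This repeatedly multiplies each pair of neighbouring matrices until only one matrix is left, so every iteration performs half of the remaining multiplications.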

res = A * B * C * D * ...         # 1024 remaining multiplications

becomes

res = (A * B) * (C * D) * ...     # 512 remaining multiplications

becomes

res = ((A * B) * (C * D)) * ...   # 256 remaining multiplications

and so on.

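For sizes that are not a power of 2, you can do this for the first 2^n matrices and then apply the same algorithm to the remaining matrices.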
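A minimal sketch of one way to handle arbitrary stack sizes. This is a variant of the suggestion above, not the answer's exact method: whenever the count is odd, the last matrix is set aside and re-appended after the pairwise step, which keeps the left-to-right order intact. The function name chain_matmul is hypothetical, and the stack is assumed to be non-empty.

```python
import numpy as np

def chain_matmul(stack):
    """Return stack[0] @ stack[1] @ ... @ stack[-1] by pairwise reduction.

    Hypothetical helper; assumes a non-empty array of shape (n, k, k).
    """
    stack = np.asarray(stack)
    while len(stack) > 1:
        if len(stack) % 2:
            # Odd count: pair up everything except the last matrix, then
            # put the last matrix back at the end so the order is preserved.
            head, tail = stack[:-1], stack[-1:]
            stack = np.concatenate([np.matmul(head[::2], head[1::2]), tail])
        else:
            # Even count: multiply neighbours (0,1), (2,3), ... in one batched call.
            stack = np.matmul(stack[::2], stack[1::2])
    return stack[0]

a = np.random.randn(1000, 4, 4)
res = chain_matmul(a)  # a single 4x4 matrix
```

Because floating-point multiplication is grouped differently than in the plain loop, the result can differ from the loop's result by rounding error, so compare with np.allclose rather than exact equality.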

Answer 2

np.linalg.multi_dot does this kind of chaining.

In [119]: a = np.random.randn(5, 4, 4)
In [120]: res = np.identity(4)
In [121]: for ai in a: res = np.matmul(res, ai)
In [122]: res
Out[122]: 
array([[ -1.04341835,  -1.22015464,   9.21459712,   0.97214725],
       [ -0.13652679,   0.61012689,  -0.07325689,  -0.17834132],
       [ -2.45684401,  -1.76347514,  12.41094524,   1.00411347],
       [ -8.36738671,  -6.5010718 ,  15.32489832,   3.62426123]])
In [123]: np.linalg.multi_dot(a)
Out[123]: 
array([[ -1.04341835,  -1.22015464,   9.21459712,   0.97214725],
       [ -0.13652679,   0.61012689,  -0.07325689,  -0.17834132],
       [ -2.45684401,  -1.76347514,  12.41094524,   1.00411347],
       [ -8.36738671,  -6.5010718 ,  15.32489832,   3.62426123]])

But it is slower: 92.3 µs per loop versus 22.2 µs per loop for the explicit loop. For the 1000-item case, the timing test is still running.
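The comparison above can be reproduced roughly along these lines. This is only a sketch using the standard timeit module; the helper name loop_product and the repeat count are assumptions, and the actual numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 4, 4))

def loop_product(stack):
    """Hypothetical helper: the explicit left-to-right loop from the question."""
    res = np.identity(stack.shape[-1])
    for m in stack:
        res = np.matmul(res, m)
    return res

n_runs = 1000  # arbitrary repeat count chosen for this sketch
t_loop = timeit.timeit(lambda: loop_product(a), number=n_runs)
t_multi = timeit.timeit(lambda: np.linalg.multi_dot(list(a)), number=n_runs)

# timeit returns total seconds for n_runs calls; convert to microseconds per call.
print(f"explicit loop: {t_loop / n_runs * 1e6:.1f} µs per call")
print(f"multi_dot:     {t_multi / n_runs * 1e6:.1f} µs per call")
```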

After working out an "optimal order", multi_dot performs a recursive dot.

def _multi_dot(arrays, order, i, j):
    """Actually do the multiplication with the given order."""
    if i == j:
        return arrays[i]
    else:
        return dot(_multi_dot(arrays, order, i, order[i, j]),
                   _multi_dot(arrays, order, order[i, j] + 1, j))

In the 1000-item case, this runs into a recursion depth error.
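For contrast, an iterative left-to-right product does not recurse at all, so it is not limited by recursion depth. A minimal sketch using functools.reduce follows; it is equivalent to the explicit loop in the question and is not expected to be faster than it.

```python
from functools import reduce
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((1000, 4, 4))

# reduce walks the first axis and applies np.matmul left to right:
# ((a[0] @ a[1]) @ a[2]) @ ... @ a[999]
res = reduce(np.matmul, a)
print(res.shape)
```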
