Question

I'm in almost exactly the same situation as the asker of this question from over a year ago: fast way to invert or dot kxnxn matrix

So I have a tensor a[n,i,j] of shape (N,M,M), and I want to invert the M*M square matrix part for each n in N.

For example, suppose I have

In [1]:    a = np.arange(12)
           a.shape = (3,2,2)
           a

Out[1]: array([[[ 0,  1],
                  [ 2,  3]],

                  [[ 4,  5],
                  [ 6,  7]],

                  [[ 8,  9],
                  [10, 11]]])

Then inverting with a for loop would look like this:

In [2]: inv_a = np.zeros([3,2,2])
        for m in range(3):
            inv_a[m] = np.linalg.inv(a[m])
        inv_a

Out[2]: array([[[-1.5,  0.5],
                  [ 1. ,  0. ]],

                  [[-3.5,  2.5],
                  [ 3. , -2. ]],

                  [[-5.5,  4.5],
                  [ 5. , -4. ]]])

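This will apparently be implemented in NumPy 2.0, according to this issue on github...

I guess I need to install the dev version, as seith mentioned in the github issue thread, but is there another way to do this in a vectorized manner right now?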

Answer 1

Update: In NumPy 1.8 and later, the functions in numpy.linalg are generalized universal functions. This means you can now do something like this:

import numpy as np
a = np.random.rand(12, 3, 3)
np.linalg.inv(a)

This inverts each 3x3 array and returns the result as a 12x3x3 array. See the numpy 1.8 release notes.
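As a quick sanity check, the batched call can be compared against the explicit loop from the question. This is only a sketch: it reuses the question's 3x2x2 example (cast to float) and assumes NumPy 1.8 or later.

```python
import numpy as np

# Same 3 x 2 x 2 stack as in the question, cast to float for inversion.
a = np.arange(12, dtype=float).reshape(3, 2, 2)

# Batched call: inv operates on the last two axes of the stack.
inv_batched = np.linalg.inv(a)

# Reference result from an explicit loop over the leading axis.
inv_loop = np.stack([np.linalg.inv(m) for m in a])

print(np.allclose(inv_batched, inv_loop))
print(np.allclose(a @ inv_batched, np.eye(2)))
```

Both checks should print True if the batched and looped results agree.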

Original answer:

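Since N (here, the side length of each square matrix) is relatively small, how about computing the LU decomposition manually for all the matrices at once? This keeps the for loops involved relatively short.

Here's how this can be done with normal NumPy syntax: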

import numpy as np
from numpy.random import rand

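def pylu3d(A):
    # In-place LU decomposition (no pivoting) of every N x N matrix in A.
    N = A.shape[1]
    for j in range(N-1):
        for i in range(j+1,N):
            #change to L
            A[:,i,j] /= A[:,j,j]
            #change to U
            A[:,i,j+1:] -= A[:,i,j:j+1] * A[:,j,j+1:]

def pylusolve(A, B):
    # A holds the packed LU factors from pylu3d; B (M x N) is overwritten
    # with the solutions.
    N = A.shape[1]
    # forward substitution with the unit lower-triangular factors
    for j in range(N-1):
        for i in range(j+1,N):
            B[:,i] -= A[:,i,j] * B[:,j]
    # back substitution with the upper-triangular factors
    for j in range(N-1,-1,-1):
        B[:,j] /= A[:,j,j]
        for i in range(j):
            B[:,i] -= A[:,i,j] * B[:,j]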

#usage
A = rand(1000000,3,3)
b = rand(3)
b = np.tile(b,(1000000,1))
pylu3d(A)
# A has been replaced with the LU decompositions
pylusolve(A, b)
# b has been replaced with the solutions of
# A[i] x = b[i] for each A[i] and b[i]

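As I have written it, pylu3d modifies A in place to compute the LU decomposition. After each N x N matrix has been replaced with its LU decomposition, pylusolve can be used to solve for an M x N array b holding the right-hand sides of your matrix systems (one length-N vector for each of the M matrices). It modifies b in place and performs the appropriate forward and back substitutions to solve each system. As written, this implementation does not include pivoting, so it is not numerically stable, but it should work well enough in most cases.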
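Since the question asks for inverses rather than solutions, here is a minimal self-contained sketch of how the same no-pivot scheme could be pointed at identity right-hand sides and checked against np.linalg.inv. The helper names lu_nopivot_inplace and lu_solve_inplace are hypothetical adaptations of the functions above (the solver is generalized to matrix right-hand sides), and the test matrices are deliberately made diagonally dominant so that the missing pivoting does no harm.

```python
import numpy as np

def lu_nopivot_inplace(A):
    """Batched LU without pivoting; overwrites A (shape K x N x N)."""
    N = A.shape[1]
    for j in range(N - 1):
        for i in range(j + 1, N):
            A[:, i, j] /= A[:, j, j]                       # multiplier (L part)
            A[:, i, j + 1:] -= A[:, i, j:j + 1] * A[:, j, j + 1:]  # U part

def lu_solve_inplace(LU, B):
    """Solve with packed LU factors; overwrites B (shape K x N x R)."""
    N = LU.shape[1]
    # Forward substitution (L has an implicit unit diagonal).
    for j in range(N - 1):
        for i in range(j + 1, N):
            B[:, i] -= LU[:, i, j, None] * B[:, j]
    # Back substitution with U.
    for j in range(N - 1, -1, -1):
        B[:, j] /= LU[:, j, j, None]
        for i in range(j):
            B[:, i] -= LU[:, i, j, None] * B[:, j]

rng = np.random.default_rng(0)
K, N = 1000, 3
# Assumption: strictly diagonally dominant input, so no pivoting is needed.
A = rng.random((K, N, N)) + N * np.eye(N)

LU = A.copy()
lu_nopivot_inplace(LU)

# Identity right-hand sides: the solutions are the inverses.
inv_A = np.broadcast_to(np.eye(N), (K, N, N)).copy()
lu_solve_inplace(LU, inv_A)

print(np.allclose(inv_A, np.linalg.inv(A)))
```

For general matrices, where a zero or tiny pivot can appear, the batched np.linalg.inv or np.linalg.solve is the safer reference.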

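Depending on how your array is arranged in memory, it is probably still a good bit faster to use Cython. Here are two Cython functions that do the same thing, but they iterate along M (the number of matrices) first. They are not vectorized, but they are relatively fast.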

from numpy cimport ndarray as ar
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def lu3d(ar[double,ndim=3] A):
    cdef int n, i, j, k, N=A.shape[0], h=A.shape[1], w=A.shape[2]
    for n in range(N):
        for j in range(h-1):
            for i in range(j+1,h):
                #change to L
                A[n,i,j] /= A[n,j,j]
                #change to U
                for k in range(j+1,w):
                    A[n,i,k] -= A[n,i,j] * A[n,j,k]

@cython.boundscheck(False)
@cython.wraparound(False)
def lusolve(ar[double,ndim=3] A, ar[double,ndim=2] b):
    cdef int n, i, j, N=A.shape[0], h=A.shape[1]
    for n in range(N):
        for j in range(h-1):
            for i in range(j+1,h):
                b[n,i] -= A[n,i,j] * b[n,j]
        for j in range(h-1,-1,-1):
            b[n,j] /= A[n,j,j]
            for i in range(j):
                b[n,i] -= A[n,i,j] * b[n,j]
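On the memory-layout point: the loops above index A[n,i,k] with the last axis innermost, which walks memory contiguously only when the stack is C-ordered. A rough way to check the layout, and to copy into C order if needed, might look like the sketch below. It is plain NumPy and assumes a hypothetical input whose matrices were originally stored along the last axis.

```python
import numpy as np

# Hypothetical input: 1000 matrices of size 3 x 3 stored along the LAST axis.
B = np.random.rand(3, 3, 1000)

# View with the matrix index first; no data is copied, so the
# strides no longer match C order.
stack = np.moveaxis(B, -1, 0)            # shape (1000, 3, 3)
print(stack.flags['C_CONTIGUOUS'])

# Copy into C order so each 3 x 3 matrix is contiguous in memory.
stack_c = np.ascontiguousarray(stack)
print(stack_c.flags['C_CONTIGUOUS'])
```

Whether the copy pays for itself depends on the array size and how many times the data is traversed, so it is worth timing both variants.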

You could also try using Numba, but I couldn't get it to run as fast as Cython in this case.

Vectorized (partial) inverse of an N*M*M tensor with numpy
