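I'm working on a Python project that uses NumPy. I frequently need to compute the Kronecker product of a matrix with the identity matrix. These products are a significant bottleneck in my code, so I'd like to optimize them. There are two kinds of product I have to take. The first is: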
np.kron(np.eye(N), A)
This version is easy to optimize by using scipy.linalg.block_diag (imported here as la). The product is equivalent to:
la.block_diag(*[A]*N)
This is about 10 times faster. However, I'm not sure how to optimize the second kind of product:
np.kron(A, np.eye(N))
Is there a similar trick I can use?
Answer 0 (score: 3)
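One approach is to initialize the output as a 4D array and then assign the values of A into it. The assignment broadcasts the values, and that broadcasting is where the efficiency gain in NumPy comes from.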
So the solution looks like this -
# Get shape of A
m,n = A.shape
# Initialize output array as 4D
out = np.zeros((m,N,n,N))
# Get range array for indexing into the second and fourth axes
r = np.arange(N)
# Index into the second and fourth axes and selecting all elements along
# the rest to assign values from A. The values are broadcasted.
out[:,r,:,r] = A
# Finally reshape back to 2D
out.shape = (m*N,n*N)
As a function -
def kron_A_N(A, N):  # Simulates np.kron(A, np.eye(N))
    m,n = A.shape
    out = np.zeros((m,N,n,N),dtype=A.dtype)
    r = np.arange(N)
    out[:,r,:,r] = A
    out.shape = (m*N,n*N)
    return out
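The reason the 4D layout works: entry (i*N + a, j*N + b) of np.kron(A, np.eye(N)) equals A[i, j] when a == b and is zero otherwise, so in a (m, N, n, N) view that entry is simply out[i, a, j, b]. The sketch below checks this on a small, arbitrarily chosen input. The np.einsum line is only an illustrative cross-check of the same index pattern, not part of the approach above.

```python
import numpy as np

def kron_A_N(A, N):  # same function as above, repeated so the snippet runs on its own
    m, n = A.shape
    out = np.zeros((m, N, n, N), dtype=A.dtype)
    r = np.arange(N)
    out[:, r, :, r] = A
    out.shape = (m * N, n * N)
    return out

# Small, arbitrary example sizes (assumptions for illustration only)
A = np.arange(6, dtype=float).reshape(2, 3)
N = 4
m, n = A.shape

expected = np.kron(A, np.eye(N))
result = kron_A_N(A, N)

# View the reference result as 4D: view[i, a, j, b] == A[i, j] if a == b else 0
view = expected.reshape(m, N, n, N)

# Cross-check: the same index pattern written as an outer product via einsum
via_einsum = np.einsum('ij,ab->iajb', A, np.eye(N)).reshape(m * N, n * N)

print(np.array_equal(result, expected))
print(np.array_equal(via_einsum, expected))
```

Note that the einsum form multiplies every element by an entry of the identity, so it is not expected to be faster than the indexed assignment; it is here only to make the index structure explicit.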
To simulate np.kron(np.eye(N), A), just swap the roles of the first and second axes, and likewise the third and fourth axes -
def kron_N_A(A, N):  # Simulates np.kron(np.eye(N), A)
    m,n = A.shape
    out = np.zeros((N,m,N,n),dtype=A.dtype)
    r = np.arange(N)
    out[r,:,r,:] = A
    out.shape = (m*N,n*N)
    return out
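A quick sanity check of kron_N_A on a non-square integer matrix may be useful here. The input values and N are arbitrary choices for illustration. It also shows one difference worth knowing: because the output is allocated with dtype=A.dtype, an integer A gives an integer result, whereas np.kron(np.eye(N), A) is promoted to floating point by the float identity matrix.

```python
import numpy as np

def kron_N_A(A, N):  # same function as above, repeated so the snippet runs on its own
    m, n = A.shape
    out = np.zeros((N, m, N, n), dtype=A.dtype)
    r = np.arange(N)
    out[r, :, r, :] = A
    out.shape = (m * N, n * N)
    return out

# Arbitrary non-square integer input (an assumption for illustration)
A = np.array([[1, 2, 3],
              [4, 5, 6]])
N = 3
m, n = A.shape

out = kron_N_A(A, N)
ref = np.kron(np.eye(N), A)

# Same values and shape as the reference
print(out.shape == ref.shape, np.array_equal(out, ref))

# Every diagonal block is a copy of A
blocks_ok = all(
    np.array_equal(out[k * m:(k + 1) * m, k * n:(k + 1) * n], A)
    for k in range(N)
)
print(blocks_ok)

# The dtypes differ: out keeps A's dtype, ref is floating point
print(out.dtype, ref.dtype)
```

If a floating-point result is required regardless of the input, converting A with A.astype(float) before calling the function is one option.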
Timings -
In [174]: N = 100
...: A = np.random.rand(100,100)
...:
In [175]: np.allclose(np.kron(A, np.eye(N)), kron_A_N(A,N))
Out[175]: True
In [176]: %timeit np.kron(A, np.eye(N))
1 loops, best of 3: 458 ms per loop
In [177]: %timeit kron_A_N(A, N)
10 loops, best of 3: 58.4 ms per loop
In [178]: 458/58.4
Out[178]: 7.842465753424658
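The numbers above come from one machine and one NumPy version, so the exact ratio will vary. To reproduce the comparison outside IPython, a minimal sketch using the standard-library timeit module could look like the following. The sizes are deliberately smaller than the N = 100 used above (an assumption made here to keep memory use modest, since the 100 x 100 case produces a 10000 x 10000 array), and the repeat counts are arbitrary.

```python
import timeit
import numpy as np

def kron_A_N(A, N):  # same function as above, repeated so the snippet runs on its own
    m, n = A.shape
    out = np.zeros((m, N, n, N), dtype=A.dtype)
    r = np.arange(N)
    out[:, r, :, r] = A
    out.shape = (m * N, n * N)
    return out

# Smaller than the sizes in the session above (assumption, to limit memory use)
N = 30
A = np.random.rand(30, 30)

# Confirm both approaches agree before timing them
assert np.allclose(np.kron(A, np.eye(N)), kron_A_N(A, N))

repeats, number = 3, 10  # arbitrary choices for the sketch

# Best-of-repeats time per call, in seconds
t_kron = min(timeit.repeat(lambda: np.kron(A, np.eye(N)),
                           repeat=repeats, number=number)) / number
t_fast = min(timeit.repeat(lambda: kron_A_N(A, N),
                           repeat=repeats, number=number)) / number

print(f"np.kron:  {t_kron * 1e3:.3f} ms per call")
print(f"kron_A_N: {t_fast * 1e3:.3f} ms per call")
print(f"speedup:  {t_kron / t_fast:.2f}x")
```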