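I have written some code that computes n matrices from the n items in a list and then multiplies all of the matrices together at the end.
The code is relatively slow, and I would like to learn more about optimising Python. I have used a profiling tool and identified this matrix multiplication loop as the slowdown in my program.
I was wondering whether anyone has suggestions for speeding it up, perhaps by making use of the C-based functions built into Python / NumPy?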
import numpy as np

def my_matrix(x):
    # Initialise overall matrix as an identity matrix
    # | M_11 M_12 |
    # | M_21 M_22 |
    M = np.matrix([[1, 0], [0, 1]])
    for z in z_all:
        param1 = func1(z)
        param2 = func2(x, z)
        param3 = func3(x, z)
        M_11 = param1 + param2
        M_12 = param1 - param2
        M_21 = param1 * param2
        M_22 = param1 / param2
        # Multiply matrix with overall master matrix
        M = M * np.matrix([[M_11, M_12], [M_21, M_22]])
    return M
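From a little background reading, it seems that function calls are computationally expensive, so computing arrays of my parameters up front and then indexing into them might be more efficient than evaluating the functions on every pass through the loop, for example:
param1s = func1(z_all)
param2s = func2(x, z_all)
# etc.
Then in the for loop:
for i, z in enumerate(z_all):
    param1 = param1s[i]
    param2 = param2s[i]
    # etc.
This is faster, but only by about 10%, because the time saved by making fewer function calls is offset by the time spent on array accesses such as param1 = param1s[i] inside the loop.
Does anyone have any suggestions?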
Answer 0 (score: 1)
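You can vectorize the computation of M_11, ... M_22 by doing M_11 = param1s + param2s and so on, provided that func1 and func2 accept an array of z-values.
That way the only thing left to do in the loop is the matrix multiplication: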
import numpy as np
...
# compute your 'params' over vectors of z-values
param1s = func1(z_all)
param2s = func2(x, z_all)
param3s = func3(x, z_all) # you don't seem to be using this for anything...
# compute 'M_11, ... M_22'
M_11 = param1s + param2s
M_12 = param1s - param2s
M_21 = param1s * param2s
M_22 = param1s / param2s
# we construct a (2, 2, nz) array from these
M_all = np.array([[M_11, M_12], [M_21, M_22]])
# roll the 'nz' axis to the front so that its shape is (nz, 2, 2)
M_all = np.rollaxis(M_all, -1, 0)
# initialize output with the identity
M_out = np.eye(2)
# loop over each (2, 2) subarray in 'M_all', update the output with the
# corresponding dot product
for mm in M_all:
    M_out = M_out.dot(mm)
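
To check that the vectorized construction gives the same result as the original per-item loop, the two can be run side by side. The following is only a minimal sketch: func1 and func2 here are hypothetical stand-ins (the real ones are not shown in the question) and are assumed to accept NumPy arrays, and the helper names chain_product_loop and chain_product_vectorised are made up for illustration. It uses np.moveaxis in place of np.rollaxis and functools.reduce in place of the explicit for loop; neither changes the number of matrix products performed.

```python
from functools import reduce

import numpy as np


# Hypothetical stand-ins for the question's func1/func2 (assumption: the
# real functions also work element-wise on NumPy arrays).
def func1(z):
    return np.cos(z)


def func2(x, z):
    return x + z ** 2  # strictly positive for x > 0, so p1 / p2 is safe


def chain_product_loop(x, z_all):
    """Reference version: build and multiply one 2x2 matrix per z."""
    M = np.eye(2)
    for z in z_all:
        p1, p2 = func1(z), func2(x, z)
        M = M.dot(np.array([[p1 + p2, p1 - p2], [p1 * p2, p1 / p2]]))
    return M


def chain_product_vectorised(x, z_all):
    """Build all matrices in one go, then multiply them in order."""
    z_all = np.asarray(z_all, dtype=float)
    p1, p2 = func1(z_all), func2(x, z_all)
    # Stack into shape (2, 2, nz), then move the nz axis to the front
    M_all = np.array([[p1 + p2, p1 - p2], [p1 * p2, p1 / p2]])
    M_all = np.moveaxis(M_all, -1, 0)  # shape (nz, 2, 2)
    # Left-to-right product, starting from the identity
    return reduce(np.dot, M_all, np.eye(2))


z_all = np.linspace(0.1, 1.0, 20)
M_loop = chain_product_loop(2.0, z_all)
M_vec = chain_product_vectorised(2.0, z_all)
print(np.allclose(M_loop, M_vec))
```

The order of the factors matters because matrix multiplication is not commutative, which is why the product is accumulated from left to right in both versions.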