I have a matrix of the following format:
matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
and coefficients that I want to selectively multiply element-wise with my matrix:
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])
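In this case, I want the first row of matrix to be multiplied by the second row of coefficients, and the second row of matrix to be multiplied by the first row of coefficients. In short, for each row of matrix I want to select the row of coefficients whose np.nan values sit in the same positions.
The positions of the np.nan values differ for each row of coefficients, because each row describes the coefficients for a different case of data availability.
Is there a quick way to do this that doesn't require writing if statements for all possible cases?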
Answer 0 (score: 2)
Approach #1
A quick way is to use NumPy broadcasting -
# Mask of NaNs
mask1 = np.isnan(matrix)
mask2 = np.isnan(coefficients)
# Compare each row of mask1 against every row of mask2, which gives a
# 3D boolean array. Reducing with all() along the last axis marks the
# row pairs whose NaN positions agree in every column. np.nonzero then
# returns the indices of those pairs: r indexes rows of matrix and
# c indexes the matching rows of coefficients.
r,c = np.nonzero((mask1[:,None] == mask2).all(-1))
# Index into arrays with those indices and perform elementwise multiplication
out = matrix[r] * coefficients[c]
Output for the given sample data -
In [40]: out
Out[40]:
array([[ 0.3, 0.6, 0.6, nan],
[ 0.5, nan, 0.6, 1.2],
[ nan, 0.4, 0.3, nan]])
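To make the broadcasting step in Approach #1 easier to follow, here is a minimal, self-contained sketch that rebuilds the sample arrays and exposes the intermediate shapes. The names compared and match are introduced only for illustration and are not part of the original answer.

```python
import numpy as np

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])

mask1 = np.isnan(matrix)        # shape (3, 4)
mask2 = np.isnan(coefficients)  # shape (3, 4)

# mask1[:, None] has shape (3, 1, 4). Comparing it with mask2 (3, 4)
# broadcasts to (3, 3, 4): entry [i, j, k] is True when row i of matrix
# and row j of coefficients agree on whether column k is NaN.
compared = mask1[:, None] == mask2

# Collapse the column axis: match[i, j] is True only when the NaN
# patterns of matrix row i and coefficients row j are identical.
match = compared.all(-1)

r, c = np.nonzero(match)
print(compared.shape)  # (3, 3, 4)
print(r, c)            # [0 1 2] [1 0 2]
```

Note that a row of matrix with no matching pattern in coefficients does not appear in r at all, so out would then have fewer rows than matrix. The intermediate array also grows with the product of the two row counts, which is worth keeping in mind for large inputs.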
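Approach #2
For performance, reduce each row of the NaN masks to its decimal equivalent (reading the mask as a binary number). Then create a storage array indexed by those decimal equivalents, store the rows of matrix in it, and multiply in the rows of coefficients at their own decimal-equivalent indices -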
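# Binary weights that turn each row's NaN mask into a unique integer
R = 2**np.arange(matrix.shape[1])
idx1 = mask1.dot(R)
idx2 = mask2.dot(R)
# Storage array with one slot per integer code, sized to cover both inputs
A = np.empty((max(idx1.max(), idx2.max())+1, matrix.shape[1]))
# Store matrix rows at their codes, then multiply in the coefficients rows.
# Assumes each NaN pattern occurs at most once per array.
A[idx1] = matrix
A[idx2] *= coefficients
# Read the products back in the original row order of matrix
out = A[idx1]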
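Approach #2 stores the rows of matrix by their NaN pattern, so two matrix rows that share a pattern would overwrite each other. A possible variant, sketched here under the assumption that each pattern occurs at most once in coefficients, stores the coefficients in the lookup table instead and indexes it with the codes of matrix. The function name multiply_by_matching_mask is hypothetical and not part of the original answer.

```python
import numpy as np

def multiply_by_matching_mask(matrix, coefficients):
    """Multiply each matrix row by the coefficients row with the same NaN pattern."""
    n_cols = matrix.shape[1]
    # Binary weights: column k contributes 2**k to a row's integer code.
    weights = 2 ** np.arange(n_cols)
    codes_m = np.isnan(matrix).dot(weights)
    codes_c = np.isnan(coefficients).dot(weights)

    # One slot per possible NaN pattern, pre-filled with NaN so that matrix
    # rows without a matching coefficients row come out as all-NaN.
    table = np.full((2 ** n_cols, n_cols), np.nan)
    table[codes_c] = coefficients

    # Look up the coefficients row for every matrix row and multiply.
    return matrix * table[codes_m]

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])

out = multiply_by_matching_mask(matrix, coefficients)
```

The table has 2**n_cols rows, so this sketch is only practical for a modest number of columns. For wide arrays the Approach #1 comparison may be the safer choice.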