Multiplication with pattern matching

Time: 2017-05-11 16:45:43

Tags: python numpy matrix pattern-matching

I have a matrix in the following format:

import numpy as np

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])

and coefficients that I want to selectively multiply element-wise with my matrix:

coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])

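In this case, I want the first row of matrix to be multiplied by the second row of coefficients, and the second row of matrix to be multiplied by the first row of coefficients. In short, for each row of matrix I want to select the row of coefficients whose np.nan values sit in the same positions.

The positions of the np.nan values differ from row to row in coefficients, because each row describes the coefficients for a different data-availability case.

Is there a quick way to do this that doesn't require writing if statements for every possible case?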

1 answer:

Answer 0: (score: 2)

Approach #1

A quick way is to use NumPy broadcasting:

# Masks of NaN positions
mask1 = np.isnan(matrix)
mask2 = np.isnan(coefficients)

# Compare each row of mask1 against every row of mask2, which gives a 3D
# boolean array. Reducing with all() along the last axis marks the row pairs
# whose NaN patterns match exactly. np.nonzero then returns the indices of
# those pairs: r indexes rows of matrix, c indexes rows of coefficients.
r,c = np.nonzero((mask1[:,None] == mask2).all(-1))

# Index into the arrays with those indices and multiply elementwise
out = matrix[r] * coefficients[c]

Output for the given sample data:

In [40]: out
Out[40]: 
array([[ 0.3,  0.6,  0.6,  nan],
       [ 0.5,  nan,  0.6,  1.2],
       [ nan,  0.4,  0.3,  nan]])
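
To make the broadcasting step easier to follow, here is a self-contained sketch that rebuilds the sample arrays and exposes the intermediate match table. The function name `match_and_multiply` is hypothetical (it is not part of the answer), and the sketch assumes every row of matrix has exactly one row in coefficients with the same NaN pattern. Rows without a match would be dropped from the result.

```python
import numpy as np

def match_and_multiply(matrix, coefficients):
    """Hypothetical wrapper around Approach #1 (name chosen for illustration)."""
    mask1 = np.isnan(matrix)        # shape (m, k)
    mask2 = np.isnan(coefficients)  # shape (n, k)
    # (m, 1, k) == (n, k) broadcasts to (m, n, k); all(-1) reduces to (m, n),
    # where entry [i, j] is True when row i and row j share a NaN pattern.
    same_pattern = (mask1[:, None] == mask2).all(-1)
    r, c = np.nonzero(same_pattern)
    return r, c, matrix[r] * coefficients[c]

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])

r, c, out = match_and_multiply(matrix, coefficients)
print(r, c)  # [0 1 2] [1 0 2]: matrix row 0 pairs with coefficients row 1, etc.
print(out)
```

The intermediate 3D array has m * n * k boolean entries, so memory use grows with the product of the two row counts. That cost is what Approach #2 tries to avoid.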

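Approach #2

For better performance, reduce each row of the NaN masks to its decimal equivalent, treating the row as a binary number. Then create a storage array, store the rows of matrix in it at those decimal indices, and multiply in place by the rows of coefficients at their own decimal indices: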

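# mask1 and mask2 are the NaN masks computed in Approach #1.
# Powers of two turn each boolean mask row into a unique integer.
R = 2**np.arange(matrix.shape[1])
idx1 = mask1.dot(R)
idx2 = mask2.dot(R)

# Storage array indexed by pattern number; this sizing assumes that
# idx2.max() <= idx1.max(), which holds for the sample data.
A = np.empty((idx1.max()+1, matrix.shape[1]))
A[idx1] = matrix
A[idx2] *= coefficients
out = A[idx1]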
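
As a worked illustration of the decimal-equivalent idea, the sketch below runs Approach #2 end to end on the sample data and prints the pattern numbers. It departs from the answer in two places, both of which are my assumptions rather than part of the original: the storage array is sized from the larger of the two index maxima, so a pattern that appears only in coefficients cannot index out of bounds, and it is filled with NaN using np.full instead of being left uninitialised by np.empty.

```python
import numpy as np

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])

mask1 = np.isnan(matrix)
mask2 = np.isnan(coefficients)

# Column j contributes 2**j when it holds a NaN, so each distinct
# NaN pattern maps to a distinct integer.
R = 2 ** np.arange(matrix.shape[1])   # [1 2 4 8]
idx1 = mask1.dot(R)
idx2 = mask2.dot(R)
print(idx1)  # [8 2 9]
print(idx2)  # [2 8 9]

# Assumption: size by the larger maximum and pre-fill with NaN.
size = max(idx1.max(), idx2.max()) + 1
A = np.full((size, matrix.shape[1]), np.nan)
A[idx1] = matrix            # slot p holds the matrix row with pattern p
A[idx2] *= coefficients     # multiply each slot by the matching coefficients row
out = A[idx1]               # read the products back in matrix row order
print(out)
```

The storage array has up to 2**k rows for k columns, so this trade of memory for speed is only reasonable when the column count is small. It also relies on each NaN pattern appearing at most once in matrix, since two rows with the same pattern would share one slot.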