Multiplying elements in a sparse array with rows in a matrix

Time: 2012-09-02 16:57:55

Tags: python matrix numpy scipy sparse-matrix

If you have a sparse matrix X:

>> from scipy.sparse import csr_matrix
>> X = csr_matrix([[0,2,0,2],[0,2,0,1]])
>> print(type(X))
>> print(X.todense())
<class 'scipy.sparse.csr.csr_matrix'>
[[0 2 0 2]
 [0 2 0 1]]

and a matrix Y:

>> print(type(Y))
>> print(Y)
<class 'numpy.matrixlib.defmatrix.matrix'>
[[8]
 [5]]

...how can you multiply each element of X by the corresponding row of Y? For example:

[[0*8 2*8 0*8 2*8]
 [0*5 2*5 0*5 1*5]]

or:

[[0 16 0 16]
 [0 10 0 5]]

I've tried this, but obviously it doesn't work because the dimensions don't match:  Z = X.data * Y
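The mismatch can be made concrete by comparing shapes. The sketch below rebuilds the example; as an assumption for illustration, Y is a plain NumPy column array rather than the np.matrix shown above.

```python
import numpy as np
from scipy.sparse import csr_matrix

X = csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([[8], [5]])  # assumption: ndarray stand-in for the np.matrix above

# X.data is a flat array holding only the stored (nonzero) values,
# so its length is the number of nonzeros, not the number of rows.
data_shape = X.data.shape   # (4,)
y_shape = Y.shape           # (2, 1)
print(data_shape, y_shape)
```

X.data has one entry per stored value and Y has one entry per row, so the two cannot be multiplied element by element without first mapping each stored value to its row.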

3 Answers:

Answer 0 (score: 8)

Unfortunately, the .multiply method of a CSR matrix seems to densify the matrix if the other operand is dense. So this is one way to avoid that:

import numpy as np
import scipy.sparse

# Assuming that Y is 1D; you might need Y = Y.A.ravel() or similar first.

# just to make the point that this works only with CSR:
if not isinstance(X, scipy.sparse.csr_matrix):
    raise ValueError('Matrix must be CSR.')

Z = X.copy()
# simply repeat each value in Y by the number of nnz elements in each row: 
Z.data *= Y.repeat(np.diff(Z.indptr))

This does create some temporaries, but at least it is fully vectorized, and it does not densify the sparse matrix.
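To see why the repeat trick works, it may help to look at the intermediate arrays for the question's example. This is a minimal, self-contained sketch of the same idea. The names counts and scale are introduced here only for illustration, and Y is assumed to already be a 1-D NumPy array.

```python
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([8, 5])  # assumption: Y already flattened to 1-D

# indptr[i]:indptr[i+1] delimits row i inside X.data,
# so consecutive differences give the number of stored values per row.
counts = np.diff(X.indptr)    # [2 2]

# Repeat each row factor once per stored value in that row.
scale = np.repeat(Y, counts)  # [8 8 5 5]

Z = X.copy()
Z.data = Z.data * scale       # only stored values are touched
print(Z.toarray())
```

Assigning to Z.data instead of using *= is a deliberate choice in this sketch: it avoids a casting error when the stored values are integers and Y holds floats.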


For a COO matrix, the equivalent is:

Z.data *= Y[Z.row] # you can use np.take, which is faster than indexing.

For a CSC matrix, the equivalent would be:

Z.data *= Y[Z.indices]
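As a quick check, the COO and CSC variants can be run on the question's example and compared against a dense reference. This is only a sketch: the names dense, expected, Z_coo and Z_csc are illustrative, and Y is again assumed to be a 1-D array.

```python
import numpy as np
import scipy.sparse as sp

dense = np.array([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([8, 5])  # assumption: one factor per row, 1-D

# COO stores an explicit row index for every stored value.
Z_coo = sp.coo_matrix(dense)
Z_coo.data = Z_coo.data * Y[Z_coo.row]

# In CSC, `indices` holds the row index of every stored value.
Z_csc = sp.csc_matrix(dense)
Z_csc.data = Z_csc.data * Y[Z_csc.indices]

# Dense reference, used here only for comparison.
expected = dense * Y[:, None]
print(np.array_equal(Z_coo.toarray(), expected))
print(np.array_equal(Z_csc.toarray(), expected))
```

In both formats the row index of each stored value is directly available, so plain fancy indexing into Y replaces the repeat step needed for CSR.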

Answer 1 (score: 1)

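The way I perform row-wise (resp. column-wise) multiplication is to use matrix multiplication with a diagonal matrix on the left (resp. on the right):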

import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0,2,0,2],
                   [0,2,0,1]])
Y = np.array([8, 5])

D = sp.diags(Y) # produces a diagonal matrix whose entries are the values of Y
Z = D.dot(X) # performs D @ X, multiplication on the left for row-wise action

Sparsity is preserved (in CSR format):

>>> print(type(Z))
<class 'scipy.sparse.csr.csr_matrix'>

The output is also correct:

>>> print(Z.toarray()) # Z is still sparse and gives the right output
[[ 0. 16.  0. 16.]
 [ 0. 10.  0.  5.]]
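The column-wise case mentioned above follows the same pattern with the diagonal matrix on the right. In this minimal sketch, the per-column weights w are made up for illustration and are not part of the original question.

```python
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0, 2, 0, 2],
                   [0, 2, 0, 1]])
w = np.array([1, 2, 3, 4])  # hypothetical weights, one per column of X

D_right = sp.diags(w)       # 4x4 diagonal matrix
Z_cols = X.dot(D_right)     # X @ D scales column j of X by w[j]

print(sp.issparse(Z_cols))
print(Z_cols.toarray())
```

The diagonal matrix must match the dimension being scaled: its size equals the number of rows when it multiplies from the left, and the number of columns when it multiplies from the right.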

Answer 2 (score: 0)

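I had the same problem. Personally, I didn't find the scipy.sparse documentation very helpful, nor did I find a function that handles this directly. So I tried writing it myself, and this solved it for me: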

Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]

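The idea is: for each element of Y at position row_y_idx, perform a scalar multiplication with the row_y_idx-th row of X. More information about accessing elements in CSR matrices here (where data is A and IA is indptr).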
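To make the slicing concrete, the sketch below prints which part of data belongs to each row of the question's X. The list rows is introduced here only so the result can be inspected.

```python
import scipy.sparse as sp

X = sp.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])

rows = []
for i in range(X.shape[0]):
    # Stored values of row i live in data[indptr[i]:indptr[i+1]];
    # the matching column numbers are in indices over the same slice.
    start, stop = X.indptr[i], X.indptr[i + 1]
    cols = X.indices[start:stop].tolist()
    vals = X.data[start:stop].tolist()
    rows.append((cols, vals))
    print(i, cols, vals)
```

Multiplying such a slice by a scalar rescales exactly one row and never touches the implicit zeros, which is why the loop keeps the matrix sparse.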

Given X and Y as you defined them:

import numpy as np
import scipy.sparse as sps

X = sps.csr_matrix([[0,2,0,2],[0,2,0,1]])
Y = np.matrix([[8], [5]])

Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]

print(type(Z))
print(Z.todense())

The output is the same as yours:

<class 'scipy.sparse.csr.csr_matrix'>
 [[ 0 16  0 16]
  [ 0 10  0  5]]