Question

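I have written this code in Python. It gives the expected results, but it is extremely slow. The bottleneck is summing multiple rows of a scipy.sparse.lil_matrix. How can I make it fast?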

# D1 is a 1.5M x 1.3M sparse matrix, read as scipy.sparse.lil_matrix.
# D2 is a 1.5M x 111 matrix, read as numpy.array
# F1 is a csv file, read using csv.reader

for row in F1:
    user_id = row[0]
    clust = D2[user_id, 110]
    neighbors = D2[ D2[:, 110] == clust][:,1]
    score = np.zeros(1300000)

    for neigh in neighbors:
        score = score + D1 [neigh, :] # the most expensive operation

    toBeWritten = np.argsort(score)[:,::-1].A[0,:]

Please also let me know if anything else here is suboptimal.

Answer 1

First, a demo with a very small matrix:

In [523]: idx=np.arange(0,8,2)
In [526]: D=np.arange(24).reshape(8,3)
In [527]: Dll=sparse.lil_matrix(D)

In [528]: D[idx,:].sum(axis=0)
Out[528]: array([36, 40, 44])

In [529]: Dll[idx,:].sum(axis=0)
Out[529]: matrix([[36, 40, 44]], dtype=int32)

In [530]: timeit D[idx,:].sum(axis=0)
100000 loops, best of 3: 17.3 µs per loop

In [531]: timeit Dll[idx,:].sum(axis=0)
1000 loops, best of 3: 1.16 ms per loop

In [532]: score=np.zeros(3)      # your looping version
In [533]: for i in idx:
   .....:     score = score + Dll[i,:]

In [534]: score
Out[534]: matrix([[ 36.,  40.,  44.]])

In [535]: %%timeit
   .....: score=np.zeros(3)
   .....: for i in idx:
   .....:     score = score + Dll[i,:]
   .....: 
100 loops, best of 3: 2.76 ms per loop

For some operations the csr format is faster:

In [537]: timeit Dll.tocsr()[idx,:].sum(axis=0)
1000 loops, best of 3: 955 µs per loop

Or, if I convert to csr beforehand:

In [538]: Dcsr=Dll.tocsr()

In [539]: timeit Dcsr[idx,:].sum(axis=0)
1000 loops, best of 3: 724 µs per loop

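That is still slow relative to the dense array.

I was going to talk about working with the data attributes of the sparse matrix as a way of selecting rows faster. But if the only purpose of selecting these rows is to sum their values, we don't need to do that.

Sparse matrices sum over rows or columns by taking a matrix product with a column or row matrix of ones. I just answered another question with the same approach: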

https://stackoverflow.com/a/37120235/901925 Efficiently compute columnwise sum of sparse array where every non-zero element is 1

For example:

In [588]: I=np.asmatrix(np.zeros((1,Dll.shape[0])))    
In [589]: I[:,idx]=1
In [590]: I
Out[590]: matrix([[ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.]])
In [591]: I*Dll
Out[591]: matrix([[ 36.,  40.,  44.]])

In [592]: %%timeit 
   .....: I=np.asmatrix(np.zeros((1,Dll.shape[0])))
   .....: I[:,idx]=1
   .....: I*Dll
   .....: 
1000 loops, best of 3: 919 µs per loop

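For this small matrix the product did not help the speed, but with Dcsr in place of Dll the time drops to 215 µs (csr is much better for math). For large matrices this product version should be an improvement.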
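
Applied to the loop in the question, the product idea can be sketched as below. This is only a sketch under some assumptions: the helper name `sum_rows` is made up for illustration, the matrix has already been converted to csr once outside the loop, and `np.bincount` is used to build the indicator vector so that a row index appearing twice is counted twice, just as it would be in the original loop.

```python
import numpy as np
from scipy import sparse

def sum_rows(mat_csr, rows):
    """Sum the selected rows of a sparse matrix with one matrix-vector product.

    `sum_rows` is a hypothetical helper, not part of scipy.
    """
    # counts[i] is how many times row i was requested (0 for unselected rows).
    counts = np.bincount(rows, minlength=mat_csr.shape[0])
    # (n_cols x n_rows) @ (n_rows,) gives a dense 1-D array of column totals.
    return mat_csr.T @ counts

# Same small test matrix as in the session above.
D = np.arange(24).reshape(8, 3)
Dcsr = sparse.csr_matrix(D)          # convert once, outside any loop
idx = np.arange(0, 8, 2)

score = sum_rows(Dcsr, idx)
print(score)
```

In the question's code this would replace the whole inner `for neigh in neighbors` loop with a single call on the csr version of D1. Since the result is a plain 1-D array rather than a `np.matrix`, the `argsort` line would then need adjusting, for example to `np.argsort(score)[::-1]`.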

=================

I just found out, in another question, that A_csr[[1,1,0,3],:] row selection is actually done with a matrix product. It constructs an 'extractor' csr matrix that looks like

matrix([[0, 1, 0, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1]])

https://stackoverflow.com/a/37245105/901925
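
To make the extractor idea concrete, here is a small sketch that builds such a matrix by hand and checks that multiplying by it reproduces the indexed rows. The function name `extractor` and the 4x3 test matrix are my own choices for illustration. This is not a copy of scipy's internal code, only the same construction written out with the public `csr_matrix((data, indices, indptr))` constructor.

```python
import numpy as np
from scipy import sparse

def extractor(rows, n_rows):
    """Build a (len(rows) x n_rows) csr matrix with a single 1 in each row.

    `extractor` is a hypothetical helper written for this illustration.
    """
    rows = np.asarray(rows)
    data = np.ones(len(rows), dtype=int)
    # Exactly one stored value per output row, so indptr is 0, 1, 2, ...
    indptr = np.arange(len(rows) + 1)
    return sparse.csr_matrix((data, rows, indptr), shape=(len(rows), n_rows))

A = sparse.csr_matrix(np.arange(12).reshape(4, 3))
M = extractor([1, 1, 0, 3], A.shape[0])
print(M.toarray())           # the 4x4 pattern shown above

selected = M @ A             # same rows as A[[1, 1, 0, 3], :]
print(selected.toarray())
```

Summing the extractor down its columns gives the row-count vector, which is why a row sum can skip the selection step and use a single vector product instead.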

Extremely slow sum row operation in Sparse LIL matrix in Python
