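I would like to be able to sort the columns of a scipy sparse matrix. The scipy documentation is fairly terse, and I can't see anything in it about modifying a matrix. On SO I found a post, but the answer given there returns a list.
The code I would like to write is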
from operator import attrgetter
from scipy.sparse import rand

s = rand(4, 4, density=0.25, format='csc')
_, colSize = s.get_shape()
for j in range(0, colSize):
    s.setcol(j, sorted(s.getcol(j), key=attrgetter('data'), reverse=True))
except that there is no setcol, and sorted does not return the same type as getcol.
As an example of what I am trying to get, if I have the input
<class 'scipy.sparse.csc.csc_matrix'>
[[ 0. 0.33201655 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0.81332962 0. 0.50794041]
[ 0. 0.41478979 0. 0. ]]
then the output I want is
[[ 0. 0.81332962 0. 0.50794041]
[ 0. 0.41478979 0. 0. ]
[ 0. 0.33201655 0. 0. ]
[ 0. 0. 0. 0. ]]
(It doesn't have to be a csc matrix; I just assumed that format would be better for column operations.)
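To pin down the desired result, here is a minimal dense reference sketch. It assumes the matrix is small enough to convert with toarray(), which defeats the purpose for large sparse matrices; the names dense and wanted are only illustrative.

```python
import numpy as np
from scipy.sparse import csc_matrix

# The example input from above, written out densely.
dense = np.array([[0., 0.33201655, 0., 0.],
                  [0., 0.,         0., 0.],
                  [0., 0.81332962, 0., 0.50794041],
                  [0., 0.41478979, 0., 0.]])
s = csc_matrix(dense)

# np.sort sorts each column ascending (axis=0); reversing the row order
# turns that into descending order within every column.
wanted = csc_matrix(np.sort(s.toarray(), axis=0)[::-1])
print(wanted.toarray())
```

This only describes the target; it does not avoid building the dense array.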
Answer 0 (score: 2)
Here is a short function that sorts the columns in descending order, in place:
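import numpy as np


def sort_csc_cols(m):
    """
    Sort the columns of m in descending order.
    m must be a csc_matrix whose nonzero values are all positive.
    m is modified in-place.
    """
    seq = np.arange(m.shape[0])
    for k in range(m.indptr.size - 1):
        # data[start:end] holds the stored values of column k.
        start, end = m.indptr[k:k + 2]
        # Sorting the reversed view ascending leaves the slice descending.
        m.data[start:end][::-1].sort()
        # Move the sorted values to the top rows 0, 1, ..., of the column.
        m.indices[start:end] = seq[:end - start]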
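The function relies on the CSC storage layout and on sorting a reversed view. The following small sketch illustrates both, using a made-up 3x3 matrix that is not part of the original answer:

```python
import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical example matrix, chosen only to show the CSC arrays.
m = csc_matrix(np.array([[0, 2, 0],
                         [5, 0, 0],
                         [3, 0, 7]]))

# indptr[k]:indptr[k + 1] delimits column k inside data and indices.
indptr = m.indptr.tolist()    # [0, 2, 3, 4]
indices = m.indices.tolist()  # row index of each stored value
data = m.data.tolist()        # stored values, column by column

# Column 0 occupies positions 0..1: values 5 and 3 in rows 1 and 2.
start, end = m.indptr[0:2]
col0_rows = m.indices[start:end].tolist()
col0_vals = m.data[start:end].tolist()

# a[::-1] is a view, so sorting it ascending in place
# leaves a itself in descending order.
a = np.array([3, 1, 2])
a[::-1].sort()
```

Because the slices of m.data and m.indices are views as well, the assignments inside the loop modify the matrix directly.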
For example, s is a csc_matrix:
In [47]: s
Out[47]:
<8x12 sparse matrix of type '<class 'numpy.int64'>'
with 19 stored elements in Compressed Sparse Column format>
In [48]: s.A
Out[48]:
array([[ 0,  2,  0,  0,  7,  0,  0, 48,  0,  0,  0,  0],
       [ 0,  0, 82,  0,  0, 38, 67, 17,  9,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 47,  0],
       [ 0,  0,  0,  0,  0,  0, 99,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0, 83,  0,  0,  0,  9],
       [ 0,  0,  0,  0,  0,  0, 85, 94,  0, 55, 68,  0],
       [ 0,  0,  0,  0,  0,  0, 22,  0,  0,  0, 71,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])
In [49]: sort_csc_cols(s)
In [50]: s.A
Out[50]:
array([[ 0,  2, 82,  0,  7, 38, 99, 94,  9, 55, 71,  9],
       [ 0,  0,  0,  0,  0,  0, 85, 83,  0,  0, 68,  0],
       [ 0,  0,  0,  0,  0,  0, 67, 48,  0,  0, 47,  0],
       [ 0,  0,  0,  0,  0,  0, 22, 17,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])
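As a sanity check, the in-place sort can be compared against a dense sort on a small random matrix. This is a sketch under assumptions: the seed, shape, and density are arbitrary, the values are kept strictly positive as the docstring requires, and the function is repeated here only so the snippet runs on its own.

```python
import numpy as np
from scipy.sparse import csc_matrix


def sort_csc_cols(m):
    """Sort each column of a positive-valued csc_matrix descending, in place."""
    seq = np.arange(m.shape[0])
    for k in range(m.indptr.size - 1):
        start, end = m.indptr[k:k + 2]
        m.data[start:end][::-1].sort()
        m.indices[start:end] = seq[:end - start]


# Random 8x12 integer matrix with roughly 20% positive entries (arbitrary choice).
rng = np.random.default_rng(0)
mask = rng.random((8, 12)) < 0.2
dense = rng.integers(1, 100, size=(8, 12)) * mask
s = csc_matrix(dense)
nnz_before = s.nnz

# Dense reference: negate, sort ascending per column, negate back.
expected = -np.sort(-dense, axis=0)

sort_csc_cols(s)
matches = np.array_equal(s.toarray(), expected)
print(matches)
```

With negative stored values the comparison would be expected to fail, since the function always moves stored entries above the implicit zeros.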