Question

Suppose I have a sparse matrix c and a numpy array a. I want to slice the entries of a according to some condition on c.

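import scipy.sparse as sps
import numpy as np
x = np.array([1, 0, 0, 1])  # row indices
y = np.array([0, 0, 0, 1])  # column indices
# Duplicate (row, col) pairs are summed, so c is [[2, 0], [1, 1]]
c = sps.csc_matrix((np.ones((4,)), (x, y)), shape=(2, 2), dtype=int)
a = np.array([[1, 2], [3, 3]])
idx = c != 0  # sparse boolean matrix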

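The variable idx is now a sparse boolean matrix (it stores only the True entries). I want to slice the array a and retrieve the entries of a at the same positions where c != 0.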

c[idx]

works fine, but the following does not:

a[idx]

I could use idx.todense(), but I have found that these .todense() calls use too much memory...

Answer 1

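You can index a with the row and column indices at which c is nonzero. To get those indices, convert c to a COO matrix and use its row and col attributes.

Here is some sample data: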

In [41]: a
Out[41]: 
array([[10, 11, 12, 13],
       [14, 15, 16, 17],
       [18, 19, 20, 21],
       [22, 23, 24, 25]])

In [42]: c
Out[42]: 
<4x4 sparse matrix of type '<type 'numpy.int64'>'
    with 4 stored elements in Compressed Sparse Column format>

In [43]: c.A
Out[43]: 
array([[0, 0, 1, 0],
       [0, 0, 0, 0],
       [1, 0, 1, 0],
       [0, 0, 0, 1]])
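The answer does not show how this sample data was built. The following is one possible construction that reproduces the printed values; the names rows and cols are illustrative and not taken from the original post, and toarray() is used in place of the .A shorthand.

```python
import numpy as np
import scipy.sparse as sps

# Dense array matching the printed `a`: the values 10..25 in a 4x4 grid.
a = np.arange(10, 26).reshape(4, 4)

# Coordinates of the four stored entries of `c` (illustrative names).
rows = np.array([2, 0, 2, 3])
cols = np.array([0, 2, 2, 3])

# Sparse matrix in CSC format with a 1 at each (row, col) pair.
c = sps.csc_matrix((np.ones(4, dtype=int), (rows, cols)), shape=(4, 4))

print(a)
print(c.toarray())
```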

Convert c to COO format:

In [45]: c2 = c.tocoo()

In [46]: c2
Out[46]: 
<4x4 sparse matrix of type '<type 'numpy.int64'>'
    with 4 stored elements in COOrdinate format>

In [47]: c2.row
Out[47]: array([2, 0, 2, 3], dtype=int32)

In [48]: c2.col
Out[48]: array([0, 2, 2, 3], dtype=int32)

Now index a with c2.row and c2.col to get the values of a at the positions where c is nonzero:

In [49]: a[c2.row, c2.col]
Out[49]: array([18, 12, 20, 25])
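A possible shortcut, not part of the original answer: sparse matrices also provide a nonzero() method that returns the row and column index arrays directly and skips any explicitly stored zeros. The sketch below rebuilds the sample data under the same assumptions as the printed output and makes no claim about the order of the returned values.

```python
import numpy as np
import scipy.sparse as sps

# Sample data assumed to match the arrays printed in the answer.
a = np.arange(10, 26).reshape(4, 4)
c = sps.csc_matrix(
    (np.ones(4, dtype=int), (np.array([2, 0, 2, 3]), np.array([0, 2, 2, 3]))),
    shape=(4, 4),
)

# nonzero() returns index arrays without building a dense boolean mask.
rows, cols = c.nonzero()

# Integer-array indexing picks one element of `a` per (row, col) pair.
picked = a[rows, cols]
print(picked)
```

Only the index arrays, whose length equals the number of nonzero entries, are materialized, which is the point of avoiding todense().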

Note, however, that the order of the values differs from that of a[idx.A]:

In [50]: a[(c != 0).A]
Out[50]: array([12, 18, 20, 25])
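If the row-major order of the boolean mask matters, one option is to sort the COO indices by row and then by column before indexing. This is a sketch rather than something from the original answer; the names keep and order are illustrative, and the filter on c2.data is there only to guard against explicitly stored zeros.

```python
import numpy as np
import scipy.sparse as sps

# Sample data assumed to match the arrays printed in the answer.
a = np.arange(10, 26).reshape(4, 4)
c = sps.csc_matrix(
    (np.ones(4, dtype=int), (np.array([2, 0, 2, 3]), np.array([0, 2, 2, 3]))),
    shape=(4, 4),
)

c2 = c.tocoo()

# Drop entries that are stored but equal to zero, so the result agrees with c != 0.
keep = c2.data != 0
rows, cols = c2.row[keep], c2.col[keep]

# np.lexsort treats the LAST key as the primary one: sort by row, then by column.
order = np.lexsort((cols, rows))

values = a[rows[order], cols[order]]
print(values)  # [12 18 20 25]
```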

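By the way, this kind of indexing of a is not "slicing". Slicing refers to indexing with "slices" created using the slice notation start:stop:step (or, less commonly, using the built-in slice object slice(start, stop, step)), e.g. a[1:3, :2]. What you are doing is sometimes called "advanced" indexing (see, e.g., http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).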
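To make the distinction concrete, here is a small sketch using a 4x4 array assumed to match the sample a above. It compares slice notation, the equivalent slice objects, and integer-array ("advanced") indexing, and checks whether each result shares memory with the original array.

```python
import numpy as np

a = np.arange(10, 26).reshape(4, 4)

# Basic slicing: slice notation and explicit slice objects are equivalent.
sliced = a[1:3, :2]
same = a[slice(1, 3), slice(None, 2)]
assert np.array_equal(sliced, same)

# Advanced indexing: integer arrays select the elements (2, 0) and (0, 2).
picked = a[np.array([2, 0]), np.array([0, 2])]

# Basic slicing returns a view of `a`; advanced indexing returns a copy.
print(np.shares_memory(a, sliced))  # True
print(np.shares_memory(a, picked))  # False
```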

Slicing a numpy array using a sparse matrix