Question

I have a csr_matrix; let's say I create it with:

import scipy.sparse as ss
mat = ss.csr_matrix((50, 100))

Now I want to modify some values of this matrix. I call:

mat[0,1]+=1

and I get:

SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.

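I only need to set a few values after creating the matrix (few relative to the size of the matrix). Later I will only read columns or apply element-wise operations to the whole matrix (such as .log1p()).

What is the correct way to do this? For now I can ignore the warning, but there is probably a better way that doesn't raise it.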

Answer 1

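You can control how warnings are displayed. By default a warning is shown once per run and then stays silent. You can change that so it raises an error, is silenced entirely, or is shown every time.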
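As a minimal sketch of those options, the standard `warnings` module can filter on `SparseEfficiencyWarning`, which `scipy.sparse` exports. The matrix shape and the names `mat` and `caught` are illustrative only.

```python
import warnings

from scipy.sparse import SparseEfficiencyWarning, csr_matrix

mat = csr_matrix((50, 100))

# Silence the warning only around the assignment.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", SparseEfficiencyWarning)
    mat[0, 1] += 1

# Turn the warning into an exception instead.
# ("always" in place of "error" would show it on every occurrence.)
caught = False
with warnings.catch_warnings():
    warnings.simplefilter("error", SparseEfficiencyWarning)
    try:
        mat[0, 2] = 1
    except SparseEfficiencyWarning:
        caught = True
```

Using `catch_warnings()` as a context manager keeps the filter change local, so the rest of the program still sees the default behaviour.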

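The usual way to build a sparse matrix is to create three coo-style arrays that together describe all the nonzero values, then build a coo matrix, or a csr directly (it accepts the same style of input).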
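A small sketch of that construction, assuming the nonzero positions and values are known up front. The arrays `rows`, `cols` and `vals` hold made-up example data.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

# Three parallel arrays: row index, column index, value of each nonzero.
rows = np.array([0, 0, 3])
cols = np.array([1, 5, 2])
vals = np.array([1.0, 2.0, 3.0])

# Both constructors accept the same (data, (row, col)) input.
coo = coo_matrix((vals, (rows, cols)), shape=(50, 100))
mat = csr_matrix((vals, (rows, cols)), shape=(50, 100))
```

Building the csr this way sets every value in one step, so no sparsity-structure change (and no warning) is involved.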

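The coo format does not implement indexing, so M[i,j]=1 is not possible there anyway. csr does implement it. I think the warning is meant to discourage making many changes (in a loop), not one or two.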

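Changing the sparsity of a csr matrix requires recalculating its whole set of attributes (data and index pointers). That is why it is expensive. I haven't done timings, but it may be almost as expensive as building the matrix from scratch.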
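To make those attributes concrete, here is a small sketch that inspects `data`, `indices` and `indptr` before and after a single assignment. The 3x4 shape is arbitrary, and the warning is silenced only to keep the output clean.

```python
import warnings

from scipy.sparse import SparseEfficiencyWarning, csr_matrix

mat = csr_matrix((3, 4))
before = (mat.data.tolist(), mat.indices.tolist(), mat.indptr.tolist())
# before == ([], [], [0, 0, 0, 0])

with warnings.catch_warnings():
    warnings.simplefilter("ignore", SparseEfficiencyWarning)
    mat[1, 2] = 5.0  # adds a new nonzero, so the structure changes

after = (mat.data.tolist(), mat.indices.tolist(), mat.indptr.tolist())
# after == ([5.0], [2], [0, 0, 1, 1])
```

Every row after the modified one has its `indptr` entry shifted, which is why a single insertion touches the arrays as a whole.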

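lil should be better suited to incremental assignment. It stores the data in lists of lists, and inserting a value into a list is fast. But converting a csr to lil and back takes time, so I wouldn't do it just to add a few items.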
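The list-of-lists layout can be seen through the `rows` and `data` attributes of a `lil_matrix`. A minimal sketch with arbitrary values:

```python
from scipy.sparse import lil_matrix

m = lil_matrix((3, 4))
m[1, 2] = 5.0
m[1, 0] = 3.0

# Each row has one Python list of column indices and one of values.
row_cols = list(m.rows[1])  # [0, 2]
row_vals = list(m.data[1])  # [3.0, 5.0]

# Convert once, after all assignments are done.
csr = m.tocsr()
```

Each assignment only edits the lists of one row, which is why it is cheap compared with rebuilding the csr arrays.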

Answer 2

Instead of:

from scipy.sparse import csr_matrix

# Create sparse matrix.
graph = csr_matrix((10, 10))
# Change sparse matrix.
graph[(1, 1)] = 0      # --- SLOW --- ^1
# Do some calculations.
graph += graph

Or:

from scipy.sparse import lil_matrix

# Create sparse matrix.
graph = lil_matrix((10, 10))
# Change sparse matrix.
graph[(1, 1)] = 0
# Do some calculations.
graph += graph         # --- SLOW --- ^2

Combine the advantages of both:

from scipy.sparse import csr_matrix, lil_matrix

# Create sparse matrix.
graph = lil_matrix((10, 10))
# Change sparse matrix.
graph[(1, 1)] = 0
# Done with changes to graph. Convert to csr.
graph = csr_matrix(graph)
# Do some calculations.
graph += graph

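Don't treat "--- SLOW ---" as an absolute commandment! It is only a caution: with some datasets, you should be aware that there may be a faster, more efficient way of doing things. With other datasets, following it will only make your code harder to read and maintain, without any performance benefit.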
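One way to decide for a particular dataset is simply to time both variants. The sketch below is an assumption-laden illustration: the helper names `fill_csr` and `fill_lil_then_csr`, the diagonal fill pattern and the size `n=200` are made up, and real timings depend on your data and SciPy version.

```python
import timeit
import warnings

from scipy.sparse import SparseEfficiencyWarning, csr_matrix, lil_matrix


def fill_csr(n=200):
    """Assign element by element directly into a csr_matrix."""
    mat = csr_matrix((n, n))
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", SparseEfficiencyWarning)
        for i in range(n):
            mat[i, i] = 1.0
    return mat


def fill_lil_then_csr(n=200):
    """Assign into a lil_matrix, then convert to csr once."""
    mat = lil_matrix((n, n))
    for i in range(n):
        mat[i, i] = 1.0
    return mat.tocsr()


t_csr = timeit.timeit(fill_csr, number=3)
t_lil = timeit.timeit(fill_lil_then_csr, number=3)
print(f"csr: {t_csr:.4f}s, lil -> csr: {t_lil:.4f}s")
```

If the two numbers are close for your workload, the simpler code is probably the better choice.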

1: "Slow" according to the warning:

/venv/lib/python3.8/site-packages/scipy/sparse/_index.py:82: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.

2: "Slow" according to the warning in the docs:

Disadvantages of the LIL format:
arithmetic operations LIL + LIL are slow (consider CSR or CSC)
