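I created a huge sparse matrix in csr format. For some reason I need to iterate over its rows (in random order) and perform a dot operation on each one. I found that this code is much slower than the same thing with a dense array. Here is the benchmark: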
# imports assumed by the session below (not shown in the original transcript)
import numpy as np
import scipy.sparse as sp
In [1]: a = sp.csr_matrix(np.random.rand(10000, 10000))
In [2]: b = a.todense()
In [126]: %timeit a[1357]
10000 loops, best of 3: 78.1 µs per loop
In [127]: %timeit b[1357]
The slowest run took 6.80 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.49 µs per loop
Row indexing on the dense array is about 30 times faster than on the csr_matrix. Am I doing this correctly, and how can I improve it?
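For reference, here is a minimal sketch of one thing I could try, assuming the cost of a[i] comes mostly from building a new sparse matrix object for each row rather than from the data access itself. It reads the row straight out of the CSR arrays (indptr, indices, data) and computes the dot product with a dense vector. The helper name csr_row_dot and the small test matrix are made up for illustration and are not part of my real code.

```python
import numpy as np
import scipy.sparse as sp


def csr_row_dot(mat, i, vec):
    """Dot product of row i of a CSR matrix with a dense 1-D vector.

    Hypothetical helper: reads the raw CSR arrays instead of slicing
    with mat[i], so no intermediate sparse matrix is created.
    """
    start, end = mat.indptr[i], mat.indptr[i + 1]
    cols = mat.indices[start:end]   # column indices of the stored entries in row i
    vals = mat.data[start:end]      # the corresponding stored values
    return vals @ vec[cols]


# Small illustrative matrix (assumption: 1% density, unlike the benchmark above).
rng = np.random.default_rng(0)
a = sp.random(1000, 1000, density=0.01, format="csr", random_state=0)
v = rng.random(1000)

# Visit a few rows in random order and compare against the slicing version.
for i in rng.permutation(a.shape[0])[:5]:
    fast = csr_row_dot(a, i, v)
    slow = np.asarray(a[i].dot(v)).ravel()[0]
    assert np.isclose(fast, slow)
```

Note that the matrix in my benchmark is built from np.random.rand, so practically every entry is nonzero and the csr format stores all of them. I have not timed this sketch on the real data, so whether it closes the gap is an open question.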