Why doesn't np.load return a usable csr_matrix that was saved with np.save?

Time: 2013-08-27 18:44:50

Tags: python python-3.x numpy scipy sparse-matrix

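If I save a CSR matrix with numpy.save() and then try to load it with numpy.load(), a lot of its properties disappear: in particular, it has no shape, and its values can no longer be accessed by index. Is this normal?

In the example below I create a CSR matrix from three arrays: the data, the indices, and the index pointers. I then save it, load it, and show that the shape and indexing operations fail on the loaded version.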

> import numpy as np
> import scipy as sp
> import scipy.sparse as ssp

> wd
Out[1]: 
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int16)

> wi
Out[1]: 
array([200003,      1, 200009, 300000, 200002, 200006, 200007, 250000,
       300500, 200010, 300501, 200001, 200000,      0, 200008, 200004,
       200005, 200011, 200018,      2, 200019, 200013, 300001, 200014,
       200015, 200022, 200012, 200020, 200021, 200016, 200017, 200023,
       200027,      2, 200030, 200032, 200028, 200033, 200031, 200029,
       200026, 200025, 200024, 200047,      2, 200042, 200045, 200046,
       200028, 200038, 200040, 200039, 200036, 200037, 200012, 200048,
       200041, 200035, 200044, 200043, 200034, 200049,      3, 200050,
            4], dtype=int32)

> wp
Out[1]: array([ 0, 18, 31, 43, 61, 65], dtype=int32)

> ww = ssp.csr_matrix((wd,wi,wp))

> ww.shape
Out[1]: (5, 300502)

> ww[2,3]
Out[1]: 0

> ww[0,0]
Out[1]: 1

> np.save('/Users/bryanfeeney/Desktop/ww.npy', ww)
> www = np.load('/Users/bryanfeeney/Desktop/ww.npy')

> www.shape
Out[1]: ()

> www[2,3]
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/core/interactiveshell.py", line 2732, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-35f1349fb755>", line 1, in <module>
    www[2,3]
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index

> www[0,0]
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/core/interactiveshell.py", line 2732, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-43c5da404060>", line 1, in <module>
    www[0,0]
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index

Here is the version information for the Python runtime, numpy and scipy, respectively.

> sys.version
Out[1]: '3.3.2 (default, May 21 2013, 11:50:47) \n[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]'

> np.__version__
Out[1]: '1.7.1'

> sp.__version__
Out[1]: '0.12.0'

2 Answers:

Answer 0 (score: 0)

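The three variables wd, wi and wp are what make up the sparse matrix. You need to save all three, because numpy's save works on numpy arrays.
Then load them back, say as wwd, wwi and wwp, and build a new matrix from them: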

# (M, N) is the shape of the original matrix, e.g. ww.shape
new_csr = ssp.csr_matrix((wwd, wwi, wwp), shape=(M, N))

For a similar discussion, see here
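
The save-three-arrays approach above could be sketched as follows. This is only a minimal illustration under assumptions: the small matrix, the temporary file name `ww_parts.npz`, and the choice of `np.savez` to bundle the arrays into one file are mine, not part of the original answer. The shape is stored alongside the arrays so that it does not have to be inferred from the largest column index.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as ssp

# Small stand-in matrix (hypothetical data, not the matrix from the question).
ww = ssp.csr_matrix(np.array([[1, 0, 2], [0, 0, 3], [4, 0, 0]], dtype=np.int16))

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "ww_parts.npz")

# Store the three defining arrays plus the shape in a single .npz archive.
np.savez(path, data=ww.data, indices=ww.indices, indptr=ww.indptr, shape=ww.shape)

# Load the pieces back and rebuild the matrix, passing the shape explicitly.
with np.load(path) as parts:
    new_csr = ssp.csr_matrix(
        (parts["data"], parts["indices"], parts["indptr"]),
        shape=tuple(int(n) for n in parts["shape"]),
    )

print(new_csr.shape)
print(np.array_equal(new_csr.toarray(), ww.toarray()))
```

Because only plain numeric arrays are written, nothing has to be pickled, and the rebuilt object supports `.shape` and `new_csr[i, j]` indexing again.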

Answer 1 (score: 0)

This looks like a bug, but you can pickle the whole sparse matrix object:

import pickle
# Pickle data is binary, so the file must be opened in 'wb' mode on Python 3.
with open('ww.pkl', 'wb') as f:
    pickle.dump(ww, f)

When you want to load it:

# Read it back in binary mode as well.
with open('ww.pkl', 'rb') as f:
    ww = pickle.load(f)
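
As far as I can tell, the behaviour in the question may be less a bug than a side effect of how np.save treats non-array inputs. The sparse matrix is not array-like, so it gets wrapped in a 0-d object array and pickled, which is why the loaded object reports a shape of (). The sketch below shows that the matrix can be pulled back out with `.item()`. The small matrix and the temporary path are made up for illustration, and the `allow_pickle=True` argument is needed on recent NumPy releases (it did not exist in the 1.7.1 used in the question).

```python
import os
import tempfile

import numpy as np
import scipy.sparse as ssp

# Small stand-in matrix (hypothetical data, not the matrix from the question).
ww = ssp.csr_matrix(np.array([[1, 0, 2], [0, 0, 3]], dtype=np.int16))

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "ww.npy")

# np.save wraps the matrix in a 0-d object array and pickles it.
np.save(path, ww)

# Recent NumPy versions refuse to unpickle unless allow_pickle=True is given.
loaded = np.load(path, allow_pickle=True)
print(loaded.shape, loaded.dtype)

# .item() extracts the single Python object held by the 0-d array.
www = loaded.item()
print(www.shape)
```

Since this path goes through pickle either way, it should only be used on files from a trusted source.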