Cannot read a SparseDataFrame back from an HDF file

Time: 2018-10-05 03:33:28

Tags: python pandas

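Here is sample code that creates a SciPy sparse matrix (a coo_matrix in this snippet). I then convert it to a SparseDataFrame and write it to an hdf5 file.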

from scipy import sparse
from numpy import array
import pandas as pd
I = array([0,3,1,0])
J = array([0,3,1,2])
V = array([4,5,7,9])
A = sparse.coo_matrix((V,(I,J)),shape=(4,4))

Then I write it to an hdf5 file as shown below.

df = pd.SparseDataFrame(A)
df.to_hdf("/tmp/tmp.hdf", "my_data")

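Now, if I try to read it back, it raises an exception: "NotImplementedError: start and/or stop are not supported in fixed Sparse reading". This is odd (it looks like a bug), because I can write in the fixed sparse format but cannot read it back.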

df2 = pd.read_hdf("/tmp/tmp.hdf", "my_data")
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 394, in read_hdf
    return store.select(key, auto_close=auto_close, **kwargs)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 741, in select
    return it.get_result()
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 1483, in get_result
    results = self.func(self.start, self.stop, where)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 734, in func
    columns=columns)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 2855, in read
    kwargs = self.validate_read(kwargs)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 2821, in validate_read
    raise NotImplementedError("start and/or stop are not supported "
NotImplementedError: start and/or stop are not supported in fixed Sparse reading

Any suggestions on how to get around this?

I also tried writing in "table" format, but in that case the write itself fails.

df.to_hdf("/tmp/tmp.hdf", "my_data", format='table')
Traceback (most recent call last):
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 1313, in _create_storer
    return globals()[_TABLE_MAP[tt]](self, group, **kwargs)
KeyError: None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/core/generic.py", line 1996, in to_hdf
    return pytables.to_hdf(path_or_buf, key, self, **kwargs)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 279, in to_hdf
    f(store)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 273, in <lambda>
    f = lambda store: store.put(key, value, **kwargs)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 890, in put
    self._write_to_group(key, value, append=append, **kwargs)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 1349, in _write_to_group
    encoding=encoding, **kwargs)
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 1315, in _create_storer
    error('_TABLE_MAP')
  File "/Users/speaktribe/.virtualenvs/domain_classifier-g6h5ez5L/lib/python3.6/site-packages/pandas/io/pytables.py", line 1239, in error
    % (t, group, type(value), format, append, kwargs)
TypeError: cannot properly create the storer for: [_TABLE_MAP] [group->/my_data (Group) '',value-><class 'pandas.core.sparse.frame.SparseDataFrame'>,format->table,append->False,kwargs->{'encoding': None}]
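
One possible fallback, sketched here on the assumption that the data only needs to round-trip to disk and does not have to live inside an HDF5 file: skip the SparseDataFrame step and persist the SciPy matrix directly with scipy.sparse.save_npz / scipy.sparse.load_npz. This does not fix the HDF read error above, it only sidesteps it. The file name and temporary directory below are hypothetical, chosen for illustration.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Same matrix as in the question.
I = np.array([0, 3, 1, 0])
J = np.array([0, 3, 1, 2])
V = np.array([4, 5, 7, 9])
A = sparse.coo_matrix((V, (I, J)), shape=(4, 4))

# Hypothetical location for illustration; save_npz writes a .npz archive.
path = os.path.join(tempfile.mkdtemp(), "my_data.npz")

sparse.save_npz(path, A)    # store the sparse matrix in SciPy's .npz format
B = sparse.load_npz(path)   # load it back as a sparse matrix

# Densifying is only acceptable here because the example is 4x4;
# for real data, compare the sparse structures instead.
round_trip_ok = np.array_equal(A.toarray(), B.toarray())
print(round_trip_ok)
```

If a pandas object is still needed after loading, newer pandas releases provide pd.DataFrame.sparse.from_spmatrix for building a sparse-backed DataFrame from a SciPy matrix. Whether that is available depends on the installed pandas version.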

0 answers:

No answers