是否有一种方法可以强制熊猫将空的DataFrame写入HDF文件?
import pandas as pd
df = pd.DataFrame(columns=['x','y'])
df.to_hdf('temp.h5', 'xxx')
df2 = pd.read_hdf('temp.h5', 'xxx')
输出:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 389, in read_hdf
return store.select(key, auto_close=auto_close, **kwargs)
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 740, in select
return it.get_result()
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 1518, in get_result
results = self.func(self.start, self.stop, where)
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 733, in func
columns=columns)
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 2986, in read
idx=i), start=_start, stop=_stop)
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 2575, in read_index
_, index = self.read_index_node(getattr(self.group, key), **kwargs)
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 2676, in read_index_node
data = node[start:stop]
File ".../Python-3.6.3/lib/python3.6/site-packages/tables/vlarray.py", line 675, in __getitem__
return self.read(start, stop, step)
File ".../Python-3.6.3/lib/python3.6/site-packages/tables/vlarray.py", line 811, in read
listarr = self._read_array(start, stop, step)
File "tables/hdf5extension.pyx", line 2106, in tables.hdf5extension.VLArray._read_array (tables/hdf5extension.c:24649)
ValueError: cannot set WRITEABLE flag to True of this array
用format='table'
书写:
import pandas as pd
df = pd.DataFrame(columns=['x','y'])
df.to_hdf('temp.h5', 'xxx', format='table')
df2 = pd.read_hdf('temp.h5', 'xxx')
输出:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 389, in read_hdf
return store.select(key, auto_close=auto_close, **kwargs)
File ".../Python-3.6.3/lib/python3.6/site-packages/pandas/io/pytables.py", line 722, in select
raise KeyError('No object named {key} in the file'.format(key=key))
KeyError: 'No object named xxx in the file'
熊猫版本:0.24.2
谢谢您的帮助!
答案 0 :(得分:0)
以HDFStore
格式将空的DataFrame放入fixed
中应该可行(也许您需要检查其他软件包的版本,例如tables
):
# Versions
pd.__version__
tables.__version__
# DF
df = pd.DataFrame(columns=['x','y'])
df
# Dump in fixed format
with pd.HDFStore('temp.h5') as store:
store.put('df', df, format='f')
print('Read:')
store.select('df')
>>> '0.24.2'
>>> '3.5.1'
>>> x y
>>>
>>> Read:
>>> x y
Pytable确实禁止这样做(至少是这样),但是对于fixed
熊猫来说,它有workaround。
但是,正如在同一个github问题中所讨论的那样,还付出了一些努力来修复table
的这种行为。但是看来解决方案仍然“悬而未决”,因为在march的结尾处是如此。