I want to read an h5 file that was previously created with PyTables.
I read the file with Pandas, applying some conditions, like this:
pd.read_hdf('myH5file.h5', 'anyTable', where='some_conditions')
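From another question, I was told that for an h5 file to be "queryable" through read_hdf's where argument, it must be written in table format and, in addition, some columns must be declared as data columns.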
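For reference, here is a minimal sketch of what "table format" and "data columns" mean on the pandas side. It assumes the file is written by pandas itself (not by raw PyTables); the column names and values are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data; "operation" is the column we want to filter on.
df = pd.DataFrame({
    "operation": [0, 1, 0, 1],
    "value": [10.0, 11.0, 12.0, 13.0],
})

path = os.path.join(tempfile.mkdtemp(), "myH5file.h5")

# format="table" makes the node queryable; data_columns lists the
# columns that may appear in a where expression.
df.to_hdf(path, key="anyTable", format="table", data_columns=["operation"])

# The where condition is evaluated on disk, so only matching rows are read.
result = pd.read_hdf(path, "anyTable", where="operation == 1")
print(result)
```

Columns that are not listed in data_columns are still stored, but they cannot be referenced in where.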
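I cannot find anything about this in the PyTables documentation.
The documentation for PyTables' create_table method says nothing about it.
So now, if I try something like this on an h5 file created with PyTables, I get the following: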
>>> d = pd.read_hdf('test_file.h5','basic_data', where='operation==1')
C:\Python27\lib\site-packages\pandas\io\pytables.py:3070: IncompatibilityWarning:
where criteria is being ignored as this version [0.0.0] is too old (or
not-defined), read the file in and write it out to a new file to upgrade (with
the copy_to method)
warnings.warn(ws, IncompatibilityWarning)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 323, in read_hdf
return f(store, True)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 305, in <lambda>
key, auto_close=auto_close, **kwargs)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 665, in select
return it.get_result()
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 1359, in get_result
results = self.func(self.start, self.stop, where)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 658, in func
columns=columns, **kwargs)
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 3968, in read
if not self.read_axes(where=where, **kwargs):
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 3196, in read_axes
values = self.selection.select()
File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 4482, in select
start=self.start, stop=self.stop)
File "C:\Python27\lib\site-packages\tables\table.py", line 1567, in read_where
self._where(condition, condvars, start, stop, step)]
File "C:\Python27\lib\site-packages\tables\table.py", line 1528, in _where
compiled = self._compile_condition(condition, condvars)
File "C:\Python27\lib\site-packages\tables\table.py", line 1366, in _compile_condition
compiled = compile_condition(condition, typemap, indexedcols)
File "C:\Python27\lib\site-packages\tables\conditions.py", line 430, in compile_condition
raise _unsupported_operation_error(nie)
NotImplementedError: unsupported operand types for *eq*: int, bytes
Edit
The traceback mentions an IncompatibilityWarning and version [0.0.0], but if I check my Pandas and Tables versions, I get:
>>> import pandas
>>> pandas.__version__
'0.15.2'
>>> import tables
>>> tables.__version__
'3.1.1'
So, I'm completely confused.
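One possible way around this, sketched below under some assumptions: as far as I understand, the [0.0.0] in the warning refers to a storage-format attribute (pandas_version) that pandas writes on its own nodes, not to the installed library versions, so a table created by raw PyTables has no such attribute. In that case the query can be run with PyTables directly and the result handed to pandas. The table layout (BasicData, its columns and values) is hypothetical and only mirrors the names used in the question.

```python
import os
import tempfile

import pandas as pd
import tables


class BasicData(tables.IsDescription):
    # Hypothetical schema for illustration only.
    operation = tables.Int32Col()
    value = tables.Float64Col()


path = os.path.join(tempfile.mkdtemp(), "test_file.h5")

# Create a table with raw PyTables, as in the question.
with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "basic_data", BasicData)
    row = table.row
    for i in range(6):
        row["operation"] = i % 2
        row["value"] = i * 1.5
        row.append()
    table.flush()

with tables.open_file(path, mode="r") as h5:
    table = h5.root.basic_data
    # Assumption: pandas marks its own nodes with this attribute.
    has_pandas_version = "pandas_version" in table.attrs
    # Let PyTables evaluate the condition; returns a NumPy structured array.
    records = table.read_where("operation == 1")

# Build the DataFrame from the already-filtered records.
d = pd.DataFrame.from_records(records)
print(has_pandas_version)
print(d)
```

This bypasses pandas' where handling entirely, so it does not depend on the file having been written in pandas' table format.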
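Answer 0 (score: 0)
I ran into the same problem, and this is what I did.
I read the HDF5 file with pandas.read_hdf, using arguments such as "where=where_string, columns=selected_columns".
I got a warning message like the one below, along with other error messages: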
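d:\Program Files\Anaconda3\lib\site-packages\pandas\io\pytables.py:3065: IncompatibilityWarning: where criteria is being ignored as this version [0.0.0] is too old (or not-defined), read the file in and write it out to a new file to upgrade (with the copy_to method)
warnings.warn(ws, IncompatibilityWarning)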
I tried commands like these:
hdf5_store = pd.HDFStore(hdf5_file, mode='r')
h5cpt_store_new = hdf5_store.copy(hdf5_new_file, complevel=9, complib='blosc')
h5cpt_store_new.close()
Then running the read command exactly as in step 2 (the pandas.read_hdf call above) works.
pandas.__version__ '0.17.1'
tables.__version__ '3.2.2'
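The steps in this answer can be sketched end to end as follows. This is only an illustration under stated assumptions: the source file here is written by pandas in table format (so it does not reproduce the original warning), the file names and data are hypothetical, and whether HDFStore.copy can upgrade a file created by raw PyTables is not verified here.

```python
import os
import tempfile

import pandas as pd

workdir = tempfile.mkdtemp()
hdf5_file = os.path.join(workdir, "old.h5")      # hypothetical source file
hdf5_new_file = os.path.join(workdir, "new.h5")  # hypothetical copy target

# Stand-in source file, written by pandas in table format.
df = pd.DataFrame({"operation": [0, 1, 1], "value": [1.0, 2.0, 3.0]})
df.to_hdf(hdf5_file, key="basic_data", format="table",
          data_columns=["operation"])

# Copy every node into a new, compressed store.
hdf5_store = pd.HDFStore(hdf5_file, mode="r")
try:
    h5cpt_store_new = hdf5_store.copy(hdf5_new_file, complevel=9,
                                      complib="blosc")
    h5cpt_store_new.close()
finally:
    hdf5_store.close()

# Query the new file with a where condition.
d = pd.read_hdf(hdf5_new_file, "basic_data", where="operation == 1")
print(d)
```

Closing both stores before calling read_hdf avoids reading a file that is still held open for writing.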