IOError: Can't read data (Can't open directory) - missing gzip compression filter

Time: 2014-10-10 14:06:06

Tags: python linux hdf5 anaconda h5py

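I had never used HDF5 files before, and I was given some sample files to get started. I have been going through the basics of h5py, looking at the different groups in these files, their names, keys, values and so on. Everything worked fine until I wanted to look at the datasets stored in the groups. I can get their .shape and .dtype, but when I try to access an arbitrary value by indexing (e.g. grp["dset"][0]), I get the following error: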

IOError                                   Traceback (most recent call last)
<ipython-input-45-509cebb66565> in <module>()
      1 print geno["matrix"].shape
      2 print geno["matrix"].dtype
----> 3 geno["matrix"][0]

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/dataset.pyc in __getitem__(self, args)
    443         mspace = h5s.create_simple(mshape)
    444         fspace = selection._id
--> 445         self.id.read(mspace, fspace, arr, mtype)
    446
    447         # Patch up the output for NumPy

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/h5d.so in h5py.h5d.DatasetID.read (h5py/h5d.c:2782)()

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_proxy.so in h5py._proxy.dset_rw (h5py/_proxy.c:1709)()

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_proxy.so in h5py._proxy.H5PY_H5Dread (h5py/_proxy.c:1379)()

IOError: Can't read data (Can't open directory)

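I posted this question to the h5py Google group, where it was suggested that the dataset might use a filter that I do not have installed. As far as I know, though, the HDF5 file was created using only gzip compression, which should be a portable standard. Does anyone know what I might be missing here? I could not find a description of this error or of a similar problem anywhere, and the file, including the problematic dataset, opens without trouble in the HDFView software.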
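One way to check which filters a dataset declares, without reading any of its data, is to look at its creation property list. The following is only a minimal sketch, assuming Python 3 and a reasonably recent h5py; the function name describe_filters is made up for illustration and is not part of h5py.

```python
import h5py
from h5py import h5z


def describe_filters(path, dataset_name):
    """List the filters a dataset declares and whether this HDF5 build has them.

    Returns a list of (filter_code, filter_name, available) tuples.
    Only metadata is read, so this works even when reading the data fails.
    """
    with h5py.File(path, "r") as f:
        dset = f[dataset_name]
        # The dataset creation property list records the filter pipeline.
        dcpl = dset.id.get_create_plist()
        result = []
        for i in range(dcpl.get_nfilters()):
            code, flags, values, name = dcpl.get_filter(i)
            # filter_avail asks the linked HDF5 library, not the file.
            result.append((code, name, bool(h5z.filter_avail(code))))
        return result
```

For a file like the one in the question, describe_filters("file.h5", "matrix") (the file name is hypothetical) would be expected to list the gzip filter, whose code is h5z.FILTER_DEFLATE, with the last element showing whether the local HDF5 library can actually apply it.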

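Edit
Apparently this error occurs because, for some reason, the gzip compression filter is not available on my system. If I try to create a sample file with gzip compression, this happens: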

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-33-dd7b9e3b6314> in <module>()
      1 grp = f.create_group("subgroup")
----> 2 grp_dset = grp.create_dataset("dataset", (50,), dtype="uint8", chunks=True, compression="gzip")

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/group.pyc in create_dataset(self, name, shape, dtype, data, **kwds)
     92         """
     93 
---> 94         dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
     95         dset = dataset.Dataset(dsid)
     96         if name is not None:

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/dataset.pyc in make_new_dset(parent, shape, dtype, data, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times)
     97 
     98     dcpl = filters.generate_dcpl(shape, dtype, chunks, compression, compression_opts,
---> 99                   shuffle, fletcher32, maxshape, scaleoffset)
    100 
    101     if fillvalue is not None:

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/filters.pyc in generate_dcpl(shape, dtype, chunks, compression, compression_opts, shuffle, fletcher32, maxshape, scaleoffset)
    101 
    102         if compression not in encode:
--> 103             raise ValueError('Compression filter "%s" is unavailable' % compression)
    104 
    105         if compression == 'gzip':

ValueError: Compression filter "gzip" is unavailable
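The same condition can be checked directly, without creating a dataset. This is a minimal sketch, assuming Python 3 and an h5py version that exposes h5py.filters; the helper name gzip_status is illustrative only.

```python
import h5py
from h5py import h5z


def gzip_status():
    """Report whether gzip (DEFLATE) is usable from this h5py installation."""
    return {
        # Does the HDF5 library that h5py is linked against have the filter?
        "library_has_deflate": bool(h5z.filter_avail(h5z.FILTER_DEFLATE)),
        # The tuples that create_dataset consults, as seen in the traceback.
        "h5py_can_encode_gzip": "gzip" in h5py.filters.encode,
        "h5py_can_decode_gzip": "gzip" in h5py.filters.decode,
    }


print(gzip_status())
```

If library_has_deflate is False, the HDF5 library itself appears to lack gzip support, which would be consistent with both errors shown here.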

Does anyone have experience with this? The installation of the HDF5 library and of the h5py package did not seem to go wrong...

5 answers:

Answer 0 (score: 1)

I can't comment; my reputation is too low.

I had the same problem. I simply ran "conda update anaconda" and the problem went away.

Answer 1 (score: 1)

I had a similar problem:

$ python3 -c 'import h5py; f=h5py.File("file.h5"); d=f["FVC"][:,:]'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
  File "/home/pinaultf/system/anaconda2/envs/deveg-dev/lib/python3.5/site-packages/h5py/_hl/dataset.py", line 482, in __getitem__
    self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
  File "h5py/h5d.pyx", line 181, in h5py.h5d.DatasetID.read (/home/ilan/minonda/conda-bld/work/h5py/h5d.c:3240)
  File "h5py/_proxy.pyx", line 130, in h5py._proxy.dset_rw (/home/ilan/minonda/conda-bld/work/h5py/_proxy.c:1869)
  File "h5py/_proxy.pyx", line 84, in h5py._proxy.H5PY_H5Dread (/home/ilan/minonda/conda-bld/work/h5py/_proxy.c:1517)
OSError: Can't read data (Can't open directory)

I hit this problem in one virtual environment but not in another, even though the h5py version is apparently the same (2.6.0).

This solved it:

$ pip uninstall h5py
$ pip install h5py
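When the same h5py version behaves differently in two environments, it can help to compare what each one actually loads. A minimal sketch to run in both environments (build_summary is an illustrative name); that the difference lies in the HDF5 build behind h5py is an assumption, not something confirmed here.

```python
import h5py


def build_summary():
    """Collect the details most likely to differ between two environments."""
    return {
        "h5py_version": h5py.version.version,
        # Version of the HDF5 library that h5py was built against.
        "hdf5_version": h5py.version.hdf5_version,
        # Shows which installation is actually being imported.
        "h5py_location": h5py.__file__,
    }


for key, value in build_summary().items():
    print(f"{key}: {value}")
```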

Answer 2 (score: 0)

I ran into the same problem. I solved it by adding:

import tables

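It works fine now.

Answer 3 (score: 0)

This error occurs when h5py cannot find a plugin it needs to open the file. For many common plugins, this can be fixed by adding: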

import hdf5plugin

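before using the h5py library. You do not need to use the hdf5plugin library directly; you only need to import it. Depending on which plugin the file uses, a different import may work for you. It would help if this error message were more descriptive.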
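Put together, the import order might look like the sketch below. It assumes Python 3; the guarded import and the helper read_first_entry are illustrative additions, not part of either library.

```python
try:
    # Imported only for its side effect of making extra HDF5 filters available.
    import hdf5plugin  # noqa: F401
except ImportError:
    # Not installed: only the filters built into the HDF5 library are usable.
    hdf5plugin = None

import h5py


def read_first_entry(path, dataset_name):
    """Open the file read-only and return the first entry of the dataset."""
    with h5py.File(path, "r") as f:
        return f[dataset_name][0]
```

Calling read_first_entry("file.h5", "FVC") (names borrowed from the answer above purely as an example) would then go through whatever filters are available in the process.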

Answer 4 (score: -1)

I switched from Python 3.6 to Python 3.8 and added "import tables", which solved it.