in Python

Time: 2015-11-19 13:20:16

Tags: python numpy hdf5 h5py

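I am trying to attach a dimension scale to a dataset that I want to store in an HDF5 file using Python, but I get an error when I try to print the attributes after setting it. The relevant code snippet is: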

import h5py as h5
import numpy as np

# create data and x-axis
my_data = np.random.randint(10, size=(100, 200))
x_axis  = np.linspace(0, 1, 100)

h5f = h5.File('my_file.h5','w')
h5f.create_dataset( 'data_1', data=my_data )
h5f['data_1'].dims[0].label = 'm'
h5f['data_1'].dims.create_scale( h5f['x_axis'], 'x' )

# the following line is creating the problems
h5f['data_1'].dims[0].attach_scale( h5f['x_axis'] )

# this is where the crash happens but only if the above line is included
for ii in h5f['data_1'].attrs.items():
    print ii

h5f.close()

The command print(h5.version.info) prints the following output:

Summary of the h5py configuration
---------------------------------

h5py    2.2.1
HDF5    1.8.11
Python  2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2]
sys.platform    linux2
sys.maxsize     9223372036854775807
numpy   1.8.2

The error message is as follows:

Traceback (most recent call last):
  File "HDF_write_dimScales.py", line 16
    for ii in h5f['data_1'].attrs.items():
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/base.py", line 347, in items
    return [(x, self.get(x)) for x in self]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/base.py", line 310, in get
    return self[name]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 55, in __getitem__
    rtdt = readtime_dtype(attr.dtype, [])
  File "h5a.pyx", line 318, in h5py.h5a.AttrID.dtype.__get__ (h5py/h5a.c:4285)
  File "h5t.pyx", line 337, in h5py.h5t.TypeID.py_dtype (h5py/h5t.c:3892)
TypeError: No NumPy equivalent for TypeVlenID exists

Thanks for any ideas or hints.

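2 answers:

Answer 0: (score: 1)

It works for me with h5py 2.5.0 after a few slight adjustments. The problem may be related to your call to create_scale. With h5py 2.5.0, I get a KeyError for h5f['x_axis'] in the create_scale() call. To get your example working, I had to explicitly create the x_axis dataset first.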

import h5py
import numpy as np

# create data and x-axis
my_data = np.random.randint(10, size=(100, 200))

# Use a context manager to ensure h5f is closed
with h5py.File('my_file.h5','w') as h5f:
    h5f.create_dataset( 'data_1', data=my_data )

    # Create the x_axis dataset directly in the HDF5 file
    h5f['x_axis']  = np.linspace(0, 1, 100)

    h5f['data_1'].dims[0].label = 'm'

    # Now we can create and attach the scale without problems
    h5f['data_1'].dims.create_scale( h5f['x_axis'], 'x' )
    h5f['data_1'].dims[0].attach_scale( h5f['x_axis'] )

    for ii in h5f['data_1'].attrs.items():
        print(ii)

# Output
#(u'DIMENSION_LABELS', array(['m', ''], dtype=object))
#(u'DIMENSION_LIST', array([array([<HDF5 object reference>], dtype=object),
#       array([], dtype=object)], dtype=object))

If you still run into problems, you may need to upgrade to h5py 2.5.0, which handles VLEN types better (though still not perfectly).
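The label and the attached scale can also be read back through the dims interface instead of iterating over attrs. The following is a minimal sketch, assuming a reasonably recent h5py; the in-memory file (driver='core'), the file name dims_demo.h5, and the make_scale fallback are illustrative choices that are not part of the original example. Whether this sidesteps the TypeError on h5py 2.2.1 has not been verified here.

```python
import h5py
import numpy as np

my_data = np.random.randint(10, size=(100, 200))

# In-memory file so nothing is written to disk (illustrative assumption)
with h5py.File('dims_demo.h5', 'w', driver='core', backing_store=False) as h5f:
    dset = h5f.create_dataset('data_1', data=my_data)
    h5f['x_axis'] = np.linspace(0, 1, 100)

    # Mark x_axis as a dimension scale named 'x'.
    # Newer h5py offers Dataset.make_scale; older versions use dims.create_scale.
    if hasattr(h5f['x_axis'], 'make_scale'):
        h5f['x_axis'].make_scale('x')
    else:
        dset.dims.create_scale(h5f['x_axis'], 'x')

    dset.dims[0].label = 'm'
    dset.dims[0].attach_scale(h5f['x_axis'])

    # Read everything back through the dims API rather than dset.attrs
    label = dset.dims[0].label
    scale_names = list(dset.dims[0].keys())
    scale_values = dset.dims[0][0][...]
    print(label, scale_names, scale_values.shape)
```

Here dims[0][0] returns the first scale attached to dimension 0 as an ordinary dataset, so its values can be sliced like any other array.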

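Answer 1: (score: 0)

This is just a guess, but since the error refers to TypeVlenID, it may be related to h5py's incomplete implementation of vlen (especially in your version of the module).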

Inexplicable behavior when using vlen with h5py

Writing to compound dataset with variable length string via h5py (HDF5)
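To tell whether an installation predates the 2.5.0 release suggested in the first answer, the version strings can be compared as tuples. This is a small sketch with hypothetical helper names (parse_version, is_older_than); it assumes a plain dotted numeric version such as '2.2.1' and would need adjusting for suffixes like 'rc1'.

```python
def parse_version(version_string):
    """Turn a dotted version such as '2.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version_string.split('.')[:3])


def is_older_than(installed, minimum):
    """Return True if the installed version sorts before the minimum."""
    return parse_version(installed) < parse_version(minimum)


# The version reported in the question versus the one suggested in the answer
print(is_older_than('2.2.1', '2.5.0'))  # True
print(is_older_than('2.5.0', '2.5.0'))  # False
```

In practice the installed string can be taken from h5py.version.version and passed as the first argument.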