How to correctly open, read, and save to a single file with h5py in Python 3

Time: 2017-10-09 09:25:03

Tags: python numpy hdf5 h5py

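I am new to Python and to programming, so I may be making terrible mistakes; any help is appreciated. I want to initialize the members of my class either by loading HDF5 data prepared by someone else or by loading an HDF5 file that I saved myself. I tried this: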

import numpy as np
import h5py
import re
import sys

class ashot:
    def __init__(self, path, load=False):
        if load is False:
            self.name = "_".join(re.findall(r"(\d+)_(\d+)/aa/shot_(\d+)", path)[0])
            f = h5py.File(path, "r")
            numpyarray = f["data/data"]
            self.array = numpyarray
        else:
            f = h5py.File(path, "a")
            self.array = f["array"]
            self.name = f["array"].attrs["name"]

    def saveshot(self):
        s = h5py.File(self.name+".h5", "a")
        s.create_dataset("array", data=self.array)
        s["array"].attrs["name"] = self.name
        s.close()
        return()

But if I run it like this:

testshot = ashot("somepath to data storage")
testshot.saveshot()
loadshot = ashot("the path I stored the shot testshot", load = True)
loadshot.saveshot()

I get:

Traceback (most recent call last):
  File "program path.py", line 191, in <module>
    loadshot.saveshot()
  File "program path.py", line 114, in saveshot
    s.create_dataset("array", data=self.array)
  File "C:\Users\Drossel\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 109, in create_dataset
    self[name] = dset
  File "C:\Users\Drossel\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\_hl\group.py", line 277, in __setitem__
    h5o.link(obj.id, self.id, name, lcpl=lcpl, lapl=self._lapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5o.pyx", line 202, in h5py.h5o.link
RuntimeError: Unable to create link (name already exists)

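I more or less understand that I am trying to write to a file that is already open, but the same code works for some reason when I use numpy.save and numpy.load instead. I tried closing the file after assigning self.array, but then I get: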

NameError: name 'ashot' is not defined

I assume this is because f is only a file handle at that point. What am I doing wrong?

1 answer:

Answer 0 (score: 1)

You are not allowed to create a dataset twice under the same name, even after closing and reopening the file:

In [34]: F = h5py.File('testh546643026.h5','a')
In [35]: ds = F.create_dataset('tst',data=np.arange(3))
In [36]: F.close()
In [37]: F = h5py.File('testh546643026.h5','a')
In [38]: ds = F.create_dataset('tst',data=np.arange(3))
....
RuntimeError: Unable to create link (Name already exists)

require_dataset returns the existing dataset (or creates a new one), but the shape and dtype must match (see its documentation):

In [41]: ds = F.require_dataset('tst',(3,),int)
In [42]: ds
Out[42]: <HDF5 dataset "tst": shape (3,), type "<i4">
In [43]: ds[:]
Out[43]: array([0, 1, 2])
In [44]: ds[:]=np.ones((3,))
In [45]: ds[:]
Out[45]: array([1, 1, 1])

If you want to freely replace an existing dataset, with a different shape or dtype, you have to delete it first.
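A minimal sketch of that delete-then-recreate approach, assuming a throwaway file in a temporary directory. The helper name replace_dataset and the file name replace_demo.h5 are made up for illustration and are not part of h5py:

```python
import os
import tempfile

import h5py
import numpy as np


def replace_dataset(h5file, name, data):
    """Hypothetical helper: store `data` under `name`, dropping any old dataset."""
    if name in h5file:
        del h5file[name]  # unlink the old dataset so the name is free again
    return h5file.create_dataset(name, data=data)


# Arbitrary demo path; any writable location would do.
demo_path = os.path.join(tempfile.mkdtemp(), "replace_demo.h5")

with h5py.File(demo_path, "a") as f:
    replace_dataset(f, "tst", np.arange(3))
    # A plain create_dataset here would raise "name already exists".
    # After the delete, the new data may have a different shape and dtype.
    replace_dataset(f, "tst", np.ones((2, 2)))
    replaced_shape = f["tst"].shape
    replaced_dtype = f["tst"].dtype
```

In the question's saveshot method the same check could go right before create_dataset. Note that deleting only removes the link; as far as I know the file does not shrink unless it is repacked.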

How to edit h5 files with h5py?
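On the error you got after closing the file: f["data/data"] is an h5py Dataset that reads from disk on demand, not a numpy array, so it stops working once the file is closed. A rough sketch of copying the data into memory first, assuming a small demo file (the file name handle_demo.h5 and its contents are invented for this example):

```python
import os
import tempfile

import h5py
import numpy as np

# Arbitrary demo file standing in for a saved shot.
demo_path = os.path.join(tempfile.mkdtemp(), "handle_demo.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("array", data=np.arange(3))

with h5py.File(demo_path, "r") as f:
    dset = f["array"]        # h5py Dataset, tied to the open file
    in_memory = dset[()]     # reads the whole dataset into a numpy.ndarray

# The file is closed here, but the in-memory copy is still usable.
total = int(in_memory.sum())
```

With the array held in memory and the file closed in __init__, saveshot no longer competes with a handle that is still open on the same file.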