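I have an HDF5 file containing many different groups, all of which have the same number of rows. I also have a boolean mask that indicates which rows to keep and which to remove. I want to iterate over all the groups in the HDF5 file and remove rows according to the mask.
The recommended method for recursively visiting all groups is visit(callable), but I can't figure out how to pass my mask to the callable.
Here is some code that hopefully demonstrates what I want to do, but which doesn't work: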
def apply_mask(name, *args):
    h5obj[name] = h5obj[name][mask]

with h5py.File(os.path.join(directory, filename), 'r+') as h5obj:
    h5obj.visit(apply_mask, mask)
This results in the error:
TypeError: visit() takes 2 positional arguments but 3 were given
How can I get my mask array into this function?
Answer 0 (score: 1)
I eventually got this working through a series of hacky workarounds. If there is a better solution, I'd be interested to hear about it!
import os

import h5py
import numpy as np

# directory, filename and mask are defined elsewhere, as in the question
with h5py.File(os.path.join(directory, filename), 'r+') as h5obj:
    # Use the visit callable to append to a list of key names
    h5_keys = []
    h5obj.visit(h5_keys.append)
    # Then loop over those keys and, if they're datasets rather than
    # groups, remove the invalid rows
    for h5_key in h5_keys:
        if isinstance(h5obj[h5_key], h5py.Dataset):
            tmp = np.array(h5obj[h5_key])[mask]
            # There is no way to simply change the dataset because its
            # shape is fixed, causing a broadcast error, so it is
            # necessary to delete and then recreate it.
            del h5obj[h5_key]
            h5obj.create_dataset(h5_key, data=tmp)
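On the original question of getting the mask into the callable: visit() and visititems() accept only a single callable, so any extra arguments have to be bound to it beforehand, for example with functools.partial (a closure or lambda would work too). Below is a minimal sketch of that idea, not something tested against the asker's file. The helper names collect_matching and mask_all_datasets are made up for illustration, and it assumes that every dataset to be filtered has a first axis of the same length as the mask. It still collects names first and rewrites afterwards, so the file is not modified while it is being traversed.

```python
from functools import partial

import h5py
import numpy as np


def collect_matching(found, mask, name, obj):
    """visititems() callback: remember datasets whose first axis matches mask.

    `found` and `mask` are bound in advance with functools.partial;
    h5py itself only supplies `name` and `obj`.
    Returning None tells h5py to keep walking the file.
    """
    if isinstance(obj, h5py.Dataset) and obj.shape and obj.shape[0] == mask.shape[0]:
        found.append(name)


def mask_all_datasets(h5obj, mask):
    """Keep only the rows where mask is True in every matching dataset."""
    names = []
    # partial() turns the 4-argument function into the 2-argument
    # callable (name, obj) that visititems() expects.
    h5obj.visititems(partial(collect_matching, names, mask))
    for name in names:
        filtered = h5obj[name][...][mask]
        # The dataset shape is fixed, so delete and recreate it,
        # as in the answer above.
        del h5obj[name]
        h5obj.create_dataset(name, data=filtered)
    return names


# Small in-memory demonstration (nothing is written to disk).
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("a", data=np.arange(5))
    f.create_dataset("grp/b", data=np.arange(10).reshape(5, 2))
    keep = np.array([True, False, True, False, True])
    mask_all_datasets(f, keep)
    print(f["a"][...], f["grp/b"].shape)
```

Note that deleting and recreating a dataset this way does not carry over its attributes or creation options such as compression, so those would need to be copied separately if they matter.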