HDF5 reading and fit_generator multiprocessing error

Time: 2018-06-17 09:09:31

Tags: multithreading tensorflow parallel-processing keras deep-learning

I am trying to use multiprocessing with fit_generator.

This is the call that gives me trouble:

trainable_model.fit_generator(
    load_random_cached_bottlenecks(BATCH_SIZE, label_map, training_addr_label_map,
                                   train_npy_dir, 'h5py', h5py_file_train),
    epochs=EPOCHS,
    steps_per_epoch=iterations_per_epoch_t,
    validation_data=load_random_cached_bottlenecks(BATCH_SIZE, label_map, validation_addr_label_map,
                                                   val_npy_dir, 'h5py', h5py_file_val),
    validation_steps=iterations_per_epoch_v,
    workers=1,
    callbacks=callback_list,
    use_multiprocessing=True,
    max_queue_size=32)

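The main arguments causing the problem are workers and use_multiprocessing.

With workers=1, it runs without problems whether use_multiprocessing is True or False.

With workers=5 and use_multiprocessing=True, it throws errors. The strange part is that training does start, but at some random iteration I get errors such as: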
KeyError: 'Unable to open object (bad local heap signature)'

KeyError: 'Unable to open object (wrong B-tree signature)'

I use h5py to read the file, and I wrote a custom generator for this purpose.

import numpy as np
from keras.utils import to_categorical

def load_random_cached_bottlenecks(batch_size, label_map,
        addr_label_map, dirs, comp_type='h5py', hdf5_file=None):
    '''
    Parameters
    ----------
    batch_size: Number of bottlenecks to be loaded along with the labels
    label_map: The dictionary that maps the class_names and the index
    addr_label_map: The dictionary that maps addrs of bottlenecks and the labels
    dirs: The bottleneck addrs to sample from; each one is used as a key into hdf5_file
    hdf5_file: This is the hdf5 file object with reading enabled.
    Returns
    -------
    batch: (bottlenecks_train, bottlenecks_labels) a batch of them which is equal to batch_size
    '''
    while True:
        # Sample batch_size addrs at random (with replacement).
        chosen_h5py = np.random.choice(dirs, size=batch_size)
        labels_for_chosen_h5py = [label_map[addr_label_map[i]] for i in chosen_h5py]
        # Read every chosen bottleneck from the already-open HDF5 file object.
        h5py_data = np.array([hdf5_file[i] for i in chosen_h5py])
        # LABEL_LENGTH is a module-level constant defined elsewhere in the script.
        h5py_onehot = to_categorical(labels_for_chosen_h5py, num_classes=LABEL_LENGTH)
        yield (h5py_data, h5py_onehot)
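
One thing worth checking with a generator like this (a possible cause, not something confirmed in this thread) is that the same open HDF5 file object is shared by every worker process. The h5py documentation advises that each process open the file independently rather than inheriting a handle opened before forking. Below is a minimal sketch of that idea, assuming the file is opened lazily on first use inside each worker. The names PerProcessHandle, random_batches, opener and choose are hypothetical and are not part of h5py or Keras; the sketch uses plain Python containers so that it runs without either library.

```python
import os


class PerProcessHandle:
    """Open a resource lazily, at most once per process.

    `opener` is any zero-argument callable that returns the resource,
    for example `lambda: h5py.File(path, "r")` (hypothetical usage).
    """

    def __init__(self, opener):
        self._opener = opener
        self._pid = None      # PID of the process that opened the handle
        self._handle = None   # nothing is opened until first use

    def get(self):
        pid = os.getpid()
        # Open if nothing is open yet, or if we are now running in a
        # different process than the one that opened the current handle.
        if self._handle is None or self._pid != pid:
            self._handle = self._opener()
            self._pid = pid
        return self._handle


def random_batches(keys, batch_size, handle, choose):
    """Yield lists of items read through a per-process handle.

    `choose(keys, batch_size)` is a hypothetical sampling function,
    for example `lambda k, n: np.random.choice(k, size=n)`.
    """
    while True:
        store = handle.get()  # resolved inside the worker, not at creation time
        yield [store[k] for k in choose(keys, batch_size)]
```

With h5py, the opener would wrap the file path instead of an open file, so that nothing is opened in the parent process before the workers start. Whether this removes the errors above depends on the setup and would need to be tested.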

I have referred to the post linked here, but it did not solve my problem.

Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.6/site-packages/keras/utils/data_utils.py", line 677, in _data_generator_task
generator_output = next(self._generator)
File "general_model.py", line 263, in load_random_cached_bottlenecks
h5py_data = np.array([hdf5_file[i] for i in chosen_h5py])
File "/opt/anaconda3/lib/python3.6/site-packages/keras/utils/data_utils.py", line 677, in _data_generator_task
generator_output = next(self._generator)
File "general_model.py", line 263, in load_random_cached_bottlenecks
h5py_data = np.array([hdf5_file[i] for i in chosen_h5py])
File "general_model.py", line 263, in <listcomp>
h5py_data = np.array([hdf5_file[i] for i in chosen_h5py])
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "general_model.py", line 263, in <listcomp>
h5py_data = np.array([hdf5_file[i] for i in chosen_h5py])
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/opt/anaconda3/lib/python3.6/site-packages/h5py/_hl/group.py", line 177, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
File "/opt/anaconda3/lib/python3.6/site-packages/h5py/_hl/group.py", line 177, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
KeyError: 'Unable to open object (wrong B-tree signature)'
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (bad symbol table node signature)'
Traceback (most recent call last):
File "general_model.py", line 437, in <module>
train_with_bottlenecks(args, label_map, trainable_model, non_trainable_model, iterations_per_epoch_t, iterations_per_epoch_v)
File "general_model.py", line 326, in train_with_bottlenecks
validation_steps=iterations_per_epoch_v, workers = 4, callbacks = callback_list, use_multiprocessing = True, max_queue_size = 32)
File "/opt/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/opt/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 2194, in fit_generator
generator_output = next(output_generator)
File "/opt/anaconda3/lib/python3.6/site-packages/keras/utils/data_utils.py", line 793, in get
six.reraise(value.__class__, value, value.__traceback__)
File "/opt/anaconda3/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
KeyError: 'Unable to open object (wrong B-tree signature)'

Any help is appreciated! Thanks in advance!

1 Answer:

Answer 0: (score: 0)

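This is not a solution per se, but it resolved the problem for me.

I was getting a similar error, OSError: Can't read data (wrong B-tree signature), when using fit_generator, also reading data from an hdf5_file in an anaconda3 virtual environment.

In my case, I created a new virtual environment and reinstalled the specific versions of the required dependencies that the code was meant to run with. After that, my code ran smoothly.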
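
When recreating an environment like this, it can help to record which versions are installed in the old and the new environment so the two can be compared. A minimal sketch using only the standard library (Python 3.8+) follows. The helper name installed_versions is hypothetical, and the package list is an assumption based on the libraries mentioned in this thread, not something the answer specified.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package list; adjust it to the project's real dependencies.
    report = installed_versions(["h5py", "numpy", "keras", "tensorflow"])
    for name, version in report.items():
        print(f"{name}=={version}" if version else f"{name} is not installed")
```

Running this in both environments and comparing the output shows which dependency versions differ.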