Memory problem when running the shapes training sample of the Keras version of Mask R-CNN

Time: 2018-07-13 01:51:21

Tags: python tensorflow keras conv-neural-network

I tried to run the training code in the shapes sample of the Keras version of Mask R-CNN. Here is the GitHub link: https://github.com/matterport/Mask_RCNN/blob/master/samples/shapes/train_shapes.ipynb. I tried a machine with more memory and reduced the number of images per epoch, but neither helped, so this is not a memory problem. Here is the code:

    model.train(dataset_train, dataset_val, 
                learning_rate=config.LEARNING_RATE, 
                epochs=1, 
                layers='heads')

Here is the error:

    OSError                                   Traceback (most recent call last)
    <ipython-input-15-83fb3ae74319> in <module>()
          6             learning_rate=config.LEARNING_RATE,
          7             epochs=1,
    ----> 8             layers='heads')

    ~/RCNN/MRCNN/Mask_RCNN/mrcnn/model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers, augmentation)
       2350             max_queue_size=100,
       2351             workers=workers,
    -> 2352             use_multiprocessing=True,
       2353         )
       2354         self.epoch = max(self.epoch, epochs)

    ~/.conda/envs/21/lib/python3.6/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
         89                 warnings.warn('Update your `' + object_name +
         90                               '` call to the Keras 2 API: ' + signature, stacklevel=2)
    ---> 91             return func(*args, **kwargs)
         92         wrapper._original_function = func
         93         return wrapper

    ~/.conda/envs/21/lib/python3.6/site-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
       1424             use_multiprocessing=use_multiprocessing,
       1425             shuffle=shuffle,
    -> 1426             initial_epoch=initial_epoch)
       1427 
       1428     @interfaces.legacy_generator_methods_support

    ~/.conda/envs/21/lib/python3.6/site-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
        135                     use_multiprocessing=use_multiprocessing,
        136                     wait_time=wait_time)
    --> 137             enqueuer.start(workers=workers, max_queue_size=max_queue_size)
        138             output_generator = enqueuer.get()
        139         else:

    ~/.conda/envs/21/lib/python3.6/site-packages/keras/utils/data_utils.py in start(self, workers, max_queue_size)
        724                     thread = threading.Thread(target=self._data_generator_task)
        725                 self._threads.append(thread)
    --> 726                 thread.start()
        727         except:
        728             self.stop()

    ~/.conda/envs/21/lib/python3.6/multiprocessing/process.py in start(self)
        103                'daemonic processes are not allowed to have children'
        104         _cleanup()
    --> 105         self._popen = self._Popen(self)
        106         self._sentinel = self._popen.sentinel
        107         # Avoid a refcycle if the target function holds an indirect

    ~/.conda/envs/21/lib/python3.6/multiprocessing/context.py in _Popen(process_obj)
        221     @staticmethod
        222     def _Popen(process_obj):
    --> 223         return _default_context.get_context().Process._Popen(process_obj)
        224 
        225 class DefaultContext(BaseContext):

    ~/.conda/envs/21/lib/python3.6/multiprocessing/context.py in _Popen(process_obj)
        275         def _Popen(process_obj):
        276             from .popen_fork import Popen
    --> 277             return Popen(process_obj)
        278 
        279     class SpawnProcess(process.BaseProcess):

    ~/.conda/envs/21/lib/python3.6/multiprocessing/popen_fork.py in __init__(self, process_obj)
         17         util._flush_std_streams()
         18         self.returncode = None
    ---> 19         self._launch(process_obj)
         20 
         21     def duplicate_for_child(self, fd):

    ~/.conda/envs/21/lib/python3.6/multiprocessing/popen_fork.py in _launch(self, process_obj)
         64         code = 1
         65         parent_r, child_w = os.pipe()
    ---> 66         self.pid = os.fork()
         67         if self.pid == 0:
         68             try:

    OSError: [Errno 12] Cannot allocate memory

I changed `use_multiprocessing` to False and it still doesn't work. Does anyone know why this error occurs? Thanks!

1 Answer:

Answer 0 (score: 0)

The stack trace comes from a run with multiprocessing enabled, so it doesn't actually tell us anything about the failure you saw after setting `use_multiprocessing` to False.
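Note that in the Matterport code, `use_multiprocessing=True`, `max_queue_size=100`, and `workers` are set inside `mrcnn/model.py` itself (the `fit_generator` call visible in the traceback around line 2350), so changing the flag anywhere else has no effect. A minimal sketch of how that call could be tuned to use less RAM; the specific values below are illustrative assumptions, not recommended settings:

    # In mrcnn/model.py, inside MaskRCNN.train() -- the fit_generator
    # call shown in the traceback. Illustrative values only:
    self.keras_model.fit_generator(
        train_generator,
        initial_epoch=self.epoch,
        epochs=epochs,
        steps_per_epoch=self.config.STEPS_PER_EPOCH,
        callbacks=callbacks,
        validation_data=val_generator,
        validation_steps=self.config.VALIDATION_STEPS,
        max_queue_size=10,           # was 100; each queued batch sits in RAM
        workers=1,                   # fewer processes for os.fork() to clone
        use_multiprocessing=False,   # threads instead of forked processes
    )

With `use_multiprocessing=False`, Keras uses threads instead of calling `os.fork()`, which is the call that raises `OSError: [Errno 12]` in your trace.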

Reducing `max_queue_size` may work. Reducing the batch size is the usual advice when *GPU* memory runs out, but here you are running out of regular RAM.
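To see why `max_queue_size` matters, here is a rough estimate of the RAM the prefetch queue alone can hold with the `max_queue_size=100` shown in the traceback. The image shape and batch size are assumptions (typical Mask R-CNN defaults), and the real footprint is larger because each batch also carries masks and RPN targets:

```python
import numpy as np

image_shape = (1024, 1024, 3)   # assumed config.IMAGE_SHAPE
batch_size = 2                  # assumed config.BATCH_SIZE
max_queue_size = 100            # value shown in the traceback

# Bytes for the input images of one batch, stored as float32.
bytes_per_batch = batch_size * np.prod(image_shape) * np.dtype(np.float32).itemsize
queue_gb = max_queue_size * bytes_per_batch / 1024**3
print(f"~{queue_gb:.1f} GB just for queued input images")  # ~2.3 GB
```

And that is per queue; with multiprocessing, each forked worker also needs the OS to reserve memory for a copy of the parent process, which is why `os.fork()` can fail even when the queue itself would fit.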