Question

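I want to apply deep learning to a classification problem in which the grayscale images in my dataset are 200x200. For now I am testing DL on a very small subset (152 images) of my large dataset (over 15,000 images). I am using the Keras library (version '1.1.2') with the Theano backend (version '0.9.0.dev4') on Python (Python 2.7.12 :: Anaconda 4.2.0 (64-bit)). My code runs on the CPU, but it is very slow, so I switched to the GPU. However, I got the following error: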

Using Theano backend.
Using gpu device 0: GeForce GTS 450 (CNMeM is enabled with initial size: 70.0% of memory, cuDNN not available)

Train on 121 samples, validate on 31 samples
Epoch 1/200
Traceback (most recent call last):

  File "<ipython-input-6-247bada3ec1a>", line 2, in <module>
    verbose=1, validation_data=(X_test, Y_test))

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/keras/models.py", line 652, in fit
    sample_weight=sample_weight)

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/keras/engine/training.py", line 1111, in fit
    initial_epoch=initial_epoch)

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/keras/engine/training.py", line 826, in _fit_loop
    outs = f(ins_batch)

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/keras/backend/theano_backend.py", line 811, in __call__
    return self.function(*inputs)

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/theano/compile/function_module.py", line 886, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)

  File "/home/user1/anaconda2/envs/keras_env/lib/python2.7/site-packages/theano/compile/function_module.py", line 873, in __call__
    self.fn() if output_subset is None else\

MemoryError: Error allocating 160579584 bytes of device memory (CNMEM_STATUS_OUT_OF_MEMORY).
Apply node that caused the error: GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}(CudaNdarrayConstant{[[[[ 0.5]]]]}, GpuElemwise{Add}[(0, 0)].0)
Toposort index: 60
Inputs types: [CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(1, 1, 1, 1), (32, 32, 198, 198)]
Inputs strides: [(0, 0, 0, 0), (1254528, 39204, 198, 1)]
Inputs values: [CudaNdarray([[[[ 0.5]]]]), 'not shown']
Outputs clients: [[GpuContiguous(GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

I tried the suggested flags (optimizer=fast_compile and optimizer=None) without success. I know the problem is related to the image size, because it works when I resize the images to 50x50.

Do you know how to solve this so that I can train on 200x200 images?

I am using Linux Mageia 5, and my GPU information is:

02:00.0 VGA compatible controller: NVIDIA Corporation GF106 [GeForce GTS 450] (rev a1)
[    64.299] (--) NVIDIA(0): Memory: 1048576 kBytes
[    64.313] (II) NVIDIA: Using 12288.00 MB of virtual memory for indirect memory
[    64.439] (==) NVIDIA(0): Disabling shared memory pixmaps

I am not sure whether cuDNN is the right way to solve my problem, but I tried to enable it by adding optimizer_including=cudnn to .theanorc. That gave me the following error:

AssertionError: cuDNN optimization was enabled, but Theano was not able to use it. We got this error: 
Device not supported

I think this may be because my GPU's compute capability is 2.1, which is below what cuDNN requires (3.0 or higher).

I would be very grateful if you could help me solve this problem and run DL on my large dataset.

Answer 1

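The error says your GPU is out of memory. Reduce the batch size, and do not load all of the data onto the GPU at once through shared variables; iterate over it in batches instead. Failing that, find another GPU with more memory.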
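To see why the batch size matters here, the failed allocation can be reproduced with plain arithmetic. The following is a rough sketch, not a measurement: it takes the tensor shape (32, 32, 198, 198) and float32 type from the error message, and it assumes the CNMeM pool is 70% of the full 1048576 kBytes the driver reports (the memory actually free is likely lower, since the display also uses the card). The helper names `activation_bytes` and `iter_batches` are made up for illustration.

```python
def activation_bytes(batch, channels, height, width, itemsize=4):
    """Bytes needed for one float32 activation tensor of the given shape."""
    return batch * channels * height * width * itemsize


def iter_batches(n_samples, batch_size):
    """Yield slices that walk over the samples one mini-batch at a time."""
    for start in range(0, n_samples, batch_size):
        yield slice(start, min(start + batch_size, n_samples))


# Assumption: the pool is 70% of the 1048576 kBytes reported by the driver.
REPORTED_KB = 1048576
POOL_BYTES = REPORTED_KB * 1024 * 70 // 100

# Shape taken from "Inputs shapes" in the traceback: (batch, 32, 198, 198).
for batch in (32, 16, 8):
    size = activation_bytes(batch, 32, 198, 198)
    print("batch=%d: %d bytes per tensor, about %d such tensors fit in the pool"
          % (batch, size, POOL_BYTES // size))

# Number of mini-batches needed to cover the 121 training samples.
print(len(list(iter_batches(121, 8))))
```

With a batch of 32 the per-tensor size comes out to exactly the 160579584 bytes in the error message, and only a handful of tensors that size fit in the pool, while training has to hold several of them at once (layer outputs and gradients). Memory per tensor scales linearly with the batch size, so passing a smaller `batch_size` to `fit` is the cheapest thing to try first.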
