Memory issues with VGG16 on TensorFlow

Asked: 2017-06-16 16:04:45

Tags: python amazon-web-services machine-learning tensorflow keras

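I have been trying to use the Keras VGG16 model with the TensorFlow backend to classify images for the 'Planet: Understanding the Amazon from Space' competition on Kaggle. Unfortunately, I keep running into memory errors when I try to run the model, even on an AWS g2.8xlarge instance with 60 GB of RAM.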

The model summary and traceback are below:

    Layer (type)                 Output Shape              Param #
    =================================================================
    input_1 (InputLayer)         (None, 224, 224, 3)       0
    _________________________________________________________________
    block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
    _________________________________________________________________
    block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
    _________________________________________________________________
    block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
    _________________________________________________________________
    block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
    _________________________________________________________________
    sequential_1 (Sequential)    (None, 1)                 6423041
    =================================================================
    Total params: 21,137,729.0
    Trainable params: 21,137,729.0
    Non-trainable params: 0.0
    _________________________________________________________________
    Traceback (most recent call last):
      File "VGG16_Kg_Kernel.py", line 160, in <module>
        train_datagen.fit(x_train)
      File "/home/ec2-user/src/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py", line 648, in fit
        x = np.copy(x)
      File "/home/ec2-user/src/anaconda3/lib/python3.5/site-packages/numpy/lib/function_base.py", line 1497, in copy
        return array(a, order=order, copy=True)
    MemoryError

The full printout can be found here: https://github.com/j-v-k/VGG16/blob/master/error_text.txt

From the printout, the GPU appears to be running, but it may not be working correctly.

The data consists of ~100K images of about 11.6 KB each. The code I use to run the model can be found here: https://github.com/j-v-k/VGG16/blob/master/VGG16_Kg_Kernel.py
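
For a sense of scale, here is a rough back-of-the-envelope sketch of how much host RAM a single in-memory array of the training images would need. It assumes, without confirmation from the linked script, that every image is resized to the model's 224x224x3 input shape and that the count is exactly 100,000; the dtype is also unknown, so three common ones are shown. `array_gib` is a hypothetical helper introduced only for this illustration.

```python
# Rough host-RAM estimate for holding the whole training set in one array.
# Assumptions (not taken from the original script): 100,000 images, each
# resized to 224x224x3 to match the model input shown in the summary.

N_IMAGES = 100000                      # "~100K" in the question, rounded
HEIGHT, WIDTH, CHANNELS = 224, 224, 3  # from input_1 in the model summary
BYTES_PER_VALUE = {"uint8": 1, "float32": 4, "float64": 8}


def array_gib(n_images, bytes_per_value):
    """Size in GiB of an (n_images, 224, 224, 3) array of the given dtype."""
    n_values = n_images * HEIGHT * WIDTH * CHANNELS
    return n_values * bytes_per_value / 2.0 ** 30


for dtype in ("uint8", "float32", "float64"):
    size = array_gib(N_IMAGES, BYTES_PER_VALUE[dtype])
    print("{:8s} {:7.1f} GiB".format(dtype, size))
```

Under these assumptions, the `np.copy(x)` call in the traceback would need a second array of the same size on top of the original, so peak usage is roughly double the figure printed for the dtype in use. According to the Keras documentation, `ImageDataGenerator.fit` is only required when `featurewise_center`, `featurewise_std_normalization`, or `zca_whitening` is enabled, and it is described as fitting on sample data, so passing a subset of `x_train` may be enough.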

Please let me know if more information is needed. Thanks!

1 Answer:

Answer 0: (score: 0)

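In short, you are running out of GPU memory rather than RAM. The G2 instances have 4 GB GPUs, and there seem to be problems running VGG16 with TensorFlow on them.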
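
One rough way to gauge the GPU-memory claim is to add up the layer outputs listed in the model summary. The sketch below is only a lower bound under stated assumptions: float32 values, forward activations plus parameters only, and the internals of `sequential_1` left out because the summary does not show them. Training also keeps gradients, optimizer state, and convolution workspace on the GPU, so real usage would be higher. The helper names are hypothetical.

```python
# Lower-bound estimate of GPU memory for one forward pass of the model in
# the summary. Assumes float32 and ignores gradients, optimizer state and
# cuDNN workspace, so actual training needs more than this.

LAYER_OUTPUT_SHAPES = [
    (224, 224, 3),    # input_1
    (224, 224, 64),   # block1_conv1
    (224, 224, 64),   # block1_conv2
    (112, 112, 64),   # block1_pool
    (112, 112, 128),  # block2_conv1
    (112, 112, 128),  # block2_conv2
    (56, 56, 128),    # block2_pool
    (56, 56, 256),    # block3_conv1
    (56, 56, 256),    # block3_conv2
    (56, 56, 256),    # block3_conv3
    (28, 28, 256),    # block3_pool
    (28, 28, 512),    # block4_conv1
    (28, 28, 512),    # block4_conv2
    (28, 28, 512),    # block4_conv3
    (14, 14, 512),    # block4_pool
    (14, 14, 512),    # block5_conv1
    (14, 14, 512),    # block5_conv2
    (14, 14, 512),    # block5_conv3
    (7, 7, 512),      # block5_pool
]
TOTAL_PARAMS = 21137729  # "Total params" from the summary
BYTES_FLOAT32 = 4


def activation_values_per_image(shapes):
    """Number of activation values stored for one image across all layers."""
    return sum(h * w * c for h, w, c in shapes)


def forward_mib(batch_size):
    """MiB for parameters plus forward activations at the given batch size."""
    values = activation_values_per_image(LAYER_OUTPUT_SHAPES) * batch_size
    return (values + TOTAL_PARAMS) * BYTES_FLOAT32 / 2.0 ** 20


for batch_size in (1, 16, 32, 64):
    print("batch {:3d}: {:8.1f} MiB".format(batch_size, forward_mib(batch_size)))
```

Because the activation term scales linearly with batch size, lowering the batch size is the most direct way to reduce this part of the footprint.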

I ran the same thing with the Theano backend and had no problems. I would suggest trying that.
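
For reference, a minimal sketch of switching backends in the multi-backend Keras releases of that period (Keras 2.x): the `KERAS_BACKEND` environment variable is read when `keras` is first imported and takes precedence over the `backend` entry in `~/.keras/keras.json`. The helper `select_keras_backend` is a hypothetical name used only here, and the sketch assumes Theano is already installed.

```python
import os


def select_keras_backend(name):
    """Set the backend that multi-backend Keras will pick up on import.

    Must be called before the first `import keras` in the process;
    changing the variable afterwards has no effect on a loaded backend.
    """
    os.environ["KERAS_BACKEND"] = name
    return os.environ["KERAS_BACKEND"]


select_keras_backend("theano")

# Only after this point would the script import Keras, for example:
# import keras
```

The same effect can be had from the shell by launching the script as `KERAS_BACKEND=theano python VGG16_Kg_Kernel.py`, which leaves the script itself unchanged.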