I'm trying to create a convolutional neural network for images. I currently have about 136 images (more will be added later) belonging to 17 classes. Each image comes as a numpy.array of shape (330, 330, 3).
I use the following code for the network:
from keras.models import Sequential
from keras.layers import Activation, Convolution2D, Dense, Dropout, Flatten, MaxPooling2D

batch_size = 64
nb_classes = 17
nb_epoch = 2
img_rows = 330
img_cols = 330
nb_filters = 16
nb_conv = 3  # convolution kernel size
nb_pool = 2

model = Sequential()
# 1st conv layer:
model.add(Convolution2D(
    nb_filters, (nb_conv, nb_conv),
    padding="valid",
    input_shape=(img_rows, img_cols, 3),
    data_format='channels_last'))
model.add(Activation('relu'))
# 2nd conv layer:
model.add(Convolution2D(nb_filters, (nb_conv, nb_conv), data_format='channels_last'))
model.add(Activation('relu'))
# maxpooling layer:
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool), data_format="channels_last"))
model.add(Dropout(0.25))
# 2 FC layers:
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
model.summary()
model.fit(X_train, y_train, batch_size=batch_size, epochs=nb_epoch, verbose=1)
However, soon after the first epoch starts, it prints a message saying "> 10% of system memory used". The machine then becomes unresponsive and I have to hard-restart it.
What steps can I take, or what changes can I make to the code, to reduce the memory requirements?
Answer 0: (score: 2)
By looking at the output of model.summary(), you can find out what is causing the problem (i.e., which layers have too many parameters):
Layer (type) Output Shape Param #
=================================================================
conv2d_189 (Conv2D) (None, 328, 328, 16) 448
_________________________________________________________________
activation_189 (Activation) (None, 328, 328, 16) 0
_________________________________________________________________
conv2d_190 (Conv2D) (None, 326, 326, 16) 2320
_________________________________________________________________
activation_190 (Activation) (None, 326, 326, 16) 0
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 163, 163, 16) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 163, 163, 16) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 425104) 0
_________________________________________________________________
dense_5 (Dense) (None, 128) 54413440
_________________________________________________________________
activation_191 (Activation) (None, 128) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 128) 0
_________________________________________________________________
dense_6 (Dense) (None, 17) 2193
_________________________________________________________________
activation_192 (Activation) (None, 17) 0
=================================================================
Total params: 54,418,401
Trainable params: 54,418,401
Non-trainable params: 0
_________________________________________________________________
As you can see, since the output of the Flatten layer is very large, the Dense layer has far too many parameters: 425104 * 128 + 128 = 54413440, i.e., about 54 million parameters in a single layer (almost 99% of all the parameters in the model). So how do we reduce this number? You need to reduce the output size of the convolution layers, either by using the strides argument (not recommended) or by using pooling layers (preferably one after each conv layer). Let's add two more pooling layers and one more conv layer (going deeper, I have also increased the number of filters in the conv layers, since that is generally a good thing to do):
# 1st conv + pooling layer:
model.add(Convolution2D(
    nb_filters, (nb_conv, nb_conv),
    padding="valid",
    input_shape=(img_rows, img_cols, 3),
    data_format='channels_last'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool), data_format="channels_last"))
# 2nd conv + pooling layer:
model.add(Convolution2D(nb_filters*2, (nb_conv, nb_conv), data_format='channels_last'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool), data_format="channels_last"))
# 3rd conv + pooling layer:
model.add(Convolution2D(nb_filters*2, (nb_conv, nb_conv), data_format='channels_last'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool), data_format="channels_last"))
# the rest is the same...
Model summary output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_197 (Conv2D) (None, 328, 328, 16) 448
_________________________________________________________________
activation_203 (Activation) (None, 328, 328, 16) 0
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 164, 164, 16) 0
_________________________________________________________________
conv2d_198 (Conv2D) (None, 162, 162, 32) 4640
_________________________________________________________________
activation_204 (Activation) (None, 162, 162, 32) 0
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 81, 81, 32) 0
_________________________________________________________________
conv2d_199 (Conv2D) (None, 79, 79, 32) 9248
_________________________________________________________________
activation_205 (Activation) (None, 79, 79, 32) 0
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 39, 39, 32) 0
_________________________________________________________________
dropout_9 (Dropout) (None, 39, 39, 32) 0
_________________________________________________________________
flatten_4 (Flatten) (None, 48672) 0
_________________________________________________________________
dense_11 (Dense) (None, 128) 6230144
_________________________________________________________________
activation_206 (Activation) (None, 128) 0
_________________________________________________________________
dropout_10 (Dropout) (None, 128) 0
_________________________________________________________________
dense_12 (Dense) (None, 17) 2193
_________________________________________________________________
activation_207 (Activation) (None, 17) 0
=================================================================
Total params: 6,246,673
Trainable params: 6,246,673
Non-trainable params: 0
_________________________________________________________________
As you can see, the model now has fewer than 6.5 million parameters, almost one tenth of the previous model's parameter count. You could even add another pooling layer to reduce the number of parameters further. However, keep in mind that as your model gets deeper (i.e., gains more layers), you may need to watch out for issues such as vanishing gradients and overfitting.
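The parameter arithmetic behind the two summaries is easy to verify; a minimal sketch, with the feature-map shapes taken directly from the Flatten rows of the summaries above:

```python
# Parameter count of a Dense(units) layer fed by a flattened (h, w, channels)
# feature map: one weight per input value per unit, plus one bias per unit.
def dense_params(h, w, channels, units=128):
    flat = h * w * channels
    return flat * units + units

# Original model: Flatten sees 163x163x16.
print(dense_params(163, 163, 16))   # 54413440
# Deeper model: Flatten sees 39x39x32.
print(dense_params(39, 39, 32))     # 6230144
```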
Answer 1: (score: 1)
Besides downscaling the input images, the only other thing you can do is reduce the batch size until it works.
You can also add more pooling to the network (it is not very deep), so that the Dense layer ends up with fewer parameters.
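A minimal sketch of the downscaling idea, assuming the images sit in a channels-last NumPy array like the one described in the question (the array names here are illustrative, not from the original code):

```python
import numpy as np

def downscale_2x(images):
    """Halve the height and width of a (N, H, W, C) batch by 2x2 average pooling."""
    n, h, w, c = images.shape
    return images.reshape(n, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))

batch = np.random.rand(4, 330, 330, 3)  # stand-in for the real image batch
small = downscale_2x(batch)
print(small.shape)  # (4, 165, 165, 3)
```

Halving each spatial dimension cuts the memory per image by a factor of four, on top of any savings from a smaller batch size.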
Answer 2: (score: 1)
"You need to go deeper" (c) =)
After 2 convolution/pooling layers you still have roughly an 80x80 feature map, which after flattening turns into a huge Dense input (as many as 6400 values per filter). With only 17 classes you need to go deeper: add more convolution and pooling so that the feature map shrinks to around 20x20 (2 extra conv/maxpool pairs). The network will then work better and need much less memory for the Dense layers.
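One way to see why two extra conv/maxpool pairs are enough: each "valid" 3x3 convolution trims 2 pixels from each spatial dimension, and each 2x2 pooling halves the result. A quick sketch of that arithmetic, starting from the 330x330 input in the question:

```python
def block_out(size, kernel=3, pool=2):
    """Spatial size after one 'valid' (kernel x kernel) conv plus (pool x pool) max pooling."""
    return (size - kernel + 1) // pool

size = 330
for block in range(1, 5):
    size = block_out(size)
    print(f"after block {block}: {size}x{size}")
# after block 1: 164x164
# after block 2: 81x81
# after block 3: 39x39
# after block 4: 18x18
```

Four conv/pool blocks bring the map down to about 20x20, which keeps the flattened Dense input small.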