I am trying to build on a pretrained MobileNet model in Keras, following the guide at https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html on the Keras blog. Here is my code:
import os

from keras.applications.mobilenet_v2 import MobileNetV2
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.callbacks import ModelCheckpoint, TensorBoard

preTrainedModel = MobileNetV2(weights='imagenet', include_top=False)
preFeatures = preTrainedModel.output
preFeatures = GlobalAveragePooling2D()(preFeatures)
preFeatures = Dense(1024, activation = 'relu')(preFeatures)
predictions = Dense(10, activation = 'softmax')(preFeatures)
# Build the full model on top of the pretrained base
model = Model(inputs=preTrainedModel.input, outputs=predictions)
# Freeze all layers of the pretrained base
for layer in preTrainedModel.layers:
    layer.trainable = False
if os.path.exists(top_layers_checkpoint_path):
    model.load_weights(top_layers_checkpoint_path)
    print("Checkpoint '" + top_layers_checkpoint_path + "' loaded.")
# Compile with the RMSprop optimizer
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Save the best model (by validation accuracy) after each epoch
mc_top = ModelCheckpoint(top_layers_checkpoint_path, monitor='val_acc', verbose=0, save_best_only=True, save_weights_only=False, mode='auto', period=1)
#Save the TensorBoard logs.
tb = TensorBoard(log_dir='./logs', histogram_freq=1, write_graph=True, write_images=True)
model.fit_generator(datGen, steps_per_epoch=batchPerEpoch, epochs=epochPerPass, validation_data=validateDataFlow, validation_steps=batchPerEpoch, callbacks=[mc_top, tb], use_multiprocessing=False)
# Print layer indices to decide where to start fine-tuning
for i, layer in enumerate(preTrainedModel.layers):
    print(i, layer.name)
# Save the best fine-tuned model (by validation accuracy) after each epoch
mc_fit = ModelCheckpoint(fine_tuned_checkpoint_path, monitor='val_acc', verbose=0, save_best_only=True, save_weights_only=False, mode='auto', period=1)
if os.path.exists(fine_tuned_checkpoint_path):
    model.load_weights(fine_tuned_checkpoint_path)
    print("Checkpoint '" + fine_tuned_checkpoint_path + "' loaded.")
# Fine-tune: freeze the first 50 layers and unfreeze the rest
for layer in model.layers[:50]:
    layer.trainable = False
for layer in model.layers[50:]:
    layer.trainable = True
# Recompile so the new trainable settings take effect
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(datGen, steps_per_epoch=batchPerEpoch, epochs=epochPerPass, validation_data=validateDataFlow, validation_steps=batchPerEpoch, callbacks=[mc_fit, tb], use_multiprocessing=False)
But it reports this error:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1373,32,112,112] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node Conv1/convolution}} = Conv2D[T=DT_FLOAT, _class=["loc:@bn_Conv1/cond/FusedBatchNorm/Switch"], data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Conv1/convolution-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv1/kernel/read)]]
Does this mean the GPU has run out of memory? How can I fix this?
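I notice the first dimension of the failing tensor is 1373, which looks like the batch size reaching the GPU. Would rebuilding my generator with an explicit, smaller batch_size help? Here is a minimal sketch of what I mean, assuming datGen comes from ImageDataGenerator.flow_from_directory (the directory path, image size, and batch size below are placeholders, not my actual values):

from keras.preprocessing.image import ImageDataGenerator

# Hypothetical rebuild of datGen with an explicit small batch size;
# 'data/train' and the sizes are placeholder values
dataGenerator = ImageDataGenerator(rescale=1./255)
datGen = dataGenerator.flow_from_directory(
    'data/train',            # placeholder training directory
    target_size=(224, 224),  # MobileNetV2 default input size
    batch_size=32,           # small batch to limit GPU memory use
    class_mode='categorical')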