tf.keras.Model.load_weights() raises ResourceExhaustedError

Asked: 2018-07-24 09:08:21

Tags: python-3.x tensorflow keras resources

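I have two ipynb files: train.ipynb and predict.ipynb. I trained a model in train.ipynb with fit_generator (batch size 64), and I get a ResourceExhaustedError when I try to load the weights in predict.ipynb. I am using keras from TensorFlow v1.9 in the tensorflow Docker image.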

# train.ipynb

def network():
    #[ A normal model]
    return model
model = network()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(seq,shuffle=True,
                    epochs = 10, verbose=1
                   )
# save the model and weight after training
with open('model.json','w') as json_file:
    json_file.write(model.to_json())
model.save_weights('model.h5')
clear_session() # tried to clear the session here
# saved both successfully
# model.h5(131MB)

After saving successfully, I can load the model back in train.ipynb. However, when I do the same thing in predict.ipynb, I get an error.

# train.ipynb
with open('model.json','r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# No error here

# predict.ipynb
with open('model.json','r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# Got the following error
ResourceExhaustedError: OOM when allocating tensor with shape[28224,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

Thanks for your help!

1 Answer:

Answer 0 (score: 0)

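Are you running both notebooks at the same time? Your GPU is out of memory. Try nvidia-smi on the command line to check the GPU's resource usage, though note that by default TensorFlow takes up all available GPU memory, so the notebook that started first may already be holding nearly all of it. keras.backend.clear_session() may also help.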
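
As a rough sketch of the checks above, the snippet below reads per-GPU memory use from nvidia-smi and, separately, shows one way to stop a TensorFlow 1.x process from reserving the whole GPU up front. The helper names (parse_gpu_memory, query_gpu_memory, use_growing_gpu_session) are made up for illustration. The allow_growth option is not something the question used; it is an assumption that it suits this setup, and it applies to the TF 1.x session API only.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs (one line per GPU) into a list of tuples."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        usage.append((used, total))
    return usage


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory use; needs an NVIDIA driver installed."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_gpu_memory(out)


def use_growing_gpu_session():
    """TF 1.x only (assumption): allocate GPU memory on demand, not all at once."""
    import tensorflow as tf  # imported lazily so the helpers above work without it

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    tf.keras.backend.set_session(tf.Session(config=config))
```

Calling query_gpu_memory() in predict.ipynb before loading the weights shows whether another process (for example the still-running train.ipynb kernel) already holds most of the memory. If it does, shutting down that kernel frees it. Calling use_growing_gpu_session() at the top of each notebook, before any model is built, is one possible way to let both notebooks share the GPU.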