I have two ipynb files: train.ipynb and predict.ipynb. I trained a model in train.ipynb with fit_generator (batch size 64), and I get a ResourceExhaustedError when I try to load the weights in predict.ipynb.
I am using Keras with TensorFlow v1.9 in the TensorFlow Docker image.
# train.ipynb
def network():
    # [ A normal model ]
    return model

model = network()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(seq, shuffle=True, epochs=10, verbose=1)

# save the model architecture and weights after training
with open('model.json', 'w') as json_file:
    json_file.write(model.to_json())
model.save_weights('model.h5')
clear_session()  # tried to clear the session here
# both files were saved successfully
# model.h5 (131MB)
After saving, I can load the model back in train.ipynb without problems. However, when I do the same thing in predict.ipynb, I get an error.
# train.ipynb
with open('model.json', 'r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# No error here

# predict.ipynb
with open('model.json', 'r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# Got the following error
ResourceExhaustedError: OOM when allocating tensor with shape[28224,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Thanks for your help!
Answer 0 (score: 0)
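Are you running both notebooks at the same time? Your GPU is out of memory. Note that by default TensorFlow claims all available GPU memory, so if the train.ipynb kernel is still running it may leave nothing for predict.ipynb. Try nvidia-smi on the command line to check the GPU's resource usage. keras.backend.clear_session() may also help.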
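As a rough sanity check, here is a minimal sketch that works out how much memory the failing tensor from the error message needs, followed by one common way to stop TensorFlow 1.x from reserving the whole GPU up front. The helper names tensor_bytes and use_growing_gpu_memory are hypothetical, and the second helper assumes standalone Keras on the TensorFlow 1.x backend, so adjust the imports if you are using tf.keras.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; float32 uses 4 bytes per element."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape taken from the error message: shape[28224,1024], type float.
failing = tensor_bytes((28224, 1024))
print("%d bytes = %.2f MiB" % (failing, failing / 2.0 ** 20))


def use_growing_gpu_memory():
    """Hypothetical helper: let TF 1.x allocate GPU memory on demand.

    Assumes TensorFlow 1.x and standalone Keras are installed. Call it at
    the top of each notebook, before any model is built or loaded.
    """
    import tensorflow as tf
    from keras import backend as K

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # do not grab all memory at start
    K.set_session(tf.Session(config=config))
```

The tensor that fails to allocate is only about 110 MiB, which suggests the GPU was already almost fully reserved by another process rather than the model being too large. If nvidia-smi confirms that, shutting down the train.ipynb kernel, or calling something like use_growing_gpu_memory() in both notebooks, is worth trying.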