Question

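I have two ipynb files: train.ipynb and predict.ipynb. In train.ipynb I trained a model with fit_generator (batch size 64), and I get a ResourceExhaustedError when I try to load the weights in predict.ipynb. I am using Keras with TensorFlow v1.9 in the TensorFlow Docker image.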

# train.ipynb

def network():
    #[ A normal model]
    return model
model = network()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(seq,shuffle=True,
                    epochs = 10, verbose=1
                   )
# save the model and weight after training
with open('model.json','w') as json_file:
    json_file.write(model.to_json())
model.save_weights('model.h5')
clear_session() # tried to clear the session here
# saved both successfully
# model.h5(131MB)

After saving successfully, I can load the model back in train.ipynb. However, when I do the same thing in predict.ipynb, I get an error.

# train.ipynb
with open('model.json','r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# No error here

# predict.ipynb
with open('model.json','r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# Got the following error
ResourceExhaustedError: OOM when allocating tensor with shape[28224,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

Thanks for your help!

Answer 1

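Are you running both notebooks at the same time? Your GPU is running out of memory. Try running nvidia-smi on the command line to check the GPU's resource usage, but note that by default TensorFlow claims all available GPU memory. keras.backend.clear_session() may also help.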
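As a rough sketch of the two checks above: the helpers below read per-GPU memory use from nvidia-smi and create a session that allocates GPU memory on demand instead of claiming it all up front. The function names are made up for illustration. The session helper assumes the TensorFlow 1.x API (tf.ConfigProto, tf.Session) and tf.keras; with standalone Keras, keras.backend.set_session would be the equivalent call.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs, one GPU per line, into a list of tuples."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        usage.append((used, total))
    return usage


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory use (needs an NVIDIA driver installed)."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"])
    return parse_gpu_memory(out.decode())


def use_growing_gpu_session():
    """Make Keras use a session that allocates GPU memory on demand.

    Assumption: TensorFlow 1.x API, as in the v1.9 setup from the question.
    Call this before building or loading any model.
    """
    import tensorflow as tf  # imported lazily so the helpers above work without it

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # do not grab all GPU memory up front
    session = tf.Session(config=config)
    tf.keras.backend.set_session(session)
    return session
```

Calling query_gpu_memory() in predict.ipynb before loading the weights shows whether the GPU is already nearly full. If it is, and the train.ipynb kernel is still alive, shutting that kernel down is probably the simplest fix, since TensorFlow generally holds on to its GPU memory until the process exits, even after clear_session().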

tf.keras.Model.load_weights() raises ResourceExhaustedError
