I am currently importing an exported Keras model into TensorFlow. The code worked fine with the single-GPU model: I was able to train the model in Python and then import it into my C++ application. Since I needed more resources, I decided to distribute the model across several GPUs. Since then I can no longer import the model.
This is how I created the model before:
input_img = Input(shape=(imgDim, imgDim, 1))
# add several layers to net
model = Model(input_img, net)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs=100,
          batch_size=100,
          shuffle=True,
          validation_data=(x_test, y_test))
saveKerasModelAsProtobuf(model, outpath)
This is how I export the model:
def saveKerasModelAsProtobuf(model, outputPath):
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={'image': model.input}, outputs={'scores': model.output})
    builder = tf.saved_model.builder.SavedModelBuilder(outputPath)
    builder.add_meta_graph_and_variables(
        sess=keras.backend.get_session(),
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                signature
        }
    )
    builder.save()
    return
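For debugging, the failure can also be reproduced from Python on a CPU-only machine before involving C++. Here is a minimal loader sketch (my addition, assuming the same TF 1.x API as above; `export_dir` stands in for my output path, and `clear_devices=True` is something I found that is forwarded to `import_meta_graph` and may strip the hard-coded device placements):

```python
def load_saved_model_cpu(export_dir):
    """Sanity-check sketch: reload the exported SavedModel in a fresh,
    CPU-only Python session before handing it to the C++ application.

    Assumes the TF 1.x API used in the export code above; export_dir is
    a placeholder for the directory passed to SavedModelBuilder.
    """
    import tensorflow as tf  # TF 1.x, matching the rest of the question

    graph = tf.Graph()
    sess = tf.Session(graph=graph)
    with graph.as_default():
        # clear_devices=True is forwarded to import_meta_graph and strips
        # hard-coded /device:GPU:n placements baked into the saved graph.
        tf.saved_model.loader.load(
            sess,
            [tf.saved_model.tag_constants.SERVING],
            export_dir,
            clear_devices=True)
    return sess, graph
```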
And this is how I changed the code to run on multiple GPUs:
input_img = Input(shape=(imgDim, imgDim, 1))
# add several layers to net
model = Model(input_img, net)
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(optimizer='adam',
                       loss='binary_crossentropy',
                       metrics=['accuracy'])
parallel_model.fit(x_train, y_train,
                   epochs=100,
                   batch_size=100,
                   shuffle=True,
                   validation_data=(x_test, y_test))
# export model rather than parallel_model:
saveKerasModelAsProtobuf(model, outpath)
When I try to import the model in C++ on a single-GPU machine, I get the following error, which suggests that what was exported is not actually the single-GPU model (as I expected) but the parallel_model:
Cannot assign a device for operation 'replica_3/lambda_4/Shape': Operation was explicitly assigned to /device:GPU:3 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[Node: replica_3/lambda_4/Shape = Shape[T=DT_FLOAT, _output_shapes=[[4]], out_type=DT_INT32, _device="/device:GPU:3"](input_1)]]
From what I have read, the two models are supposed to share the same weights, not the internal structure. What am I doing wrong? Is there a better or more generic way to export the model?
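One workaround I am considering (a sketch, under the assumption that only the weights matter): after training, rebuild the architecture in a brand-new graph and export that clean copy, so that no replica ops or GPU device placements can end up in the SavedModel. `build_model` is a hypothetical helper standing in for the "add several layers to net" part above:

```python
def export_clean_model(trained_model, build_model, outpath):
    """Export from a graph that never contained the multi-GPU replicas.

    trained_model: the template `model` (not parallel_model) after fit().
    build_model:   hypothetical helper that rebuilds the same architecture
                   ("add several layers to net" in the snippets above).
    """
    import keras  # standalone Keras, as in the rest of the question

    # 1. Keep only the weights (plain numpy arrays) from the trained model.
    weights = trained_model.get_weights()

    # 2. Rebuild the identical architecture in a fresh graph/session, so
    #    no replica_*/device:GPU:n ops exist in what gets exported.
    keras.backend.clear_session()
    clean_model = build_model()
    clean_model.set_weights(weights)

    # 3. Export exactly as before, using the function defined earlier
    #    in the question.
    saveKerasModelAsProtobuf(clean_model, outpath)
```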
Thanks!