I trained an object detection model (YOLO-v3) in TensorFlow and saved it in serving format with "tf.saved_model" so I can host it with TensorFlow Serving. Without TF Serving, the regular average inference time is about 35 ms per image.
However, when I serve the same model with TensorFlow Serving on the same machine (localhost), the inference time jumps to about 1 s per image (~30x slower!), and I cannot figure out why or what is going wrong.
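For reference, the serving latency above is measured with a simple gRPC client roughly like the sketch below (the model name, port and dummy input are placeholders; 'Detect_logos' is the signature exported by the code further down):
import time
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel('localhost:8500')  # default gRPC port (placeholder)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Dummy input; the shape must match the model's input placeholder.
image = np.random.rand(1, 832, 832, 3).astype(np.float32)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'yolov3_logos'            # placeholder model name
request.model_spec.signature_name = 'Detect_logos'  # signature exported below
request.inputs['inputs'].CopyFrom(
    tf.make_tensor_proto(image, shape=image.shape))

start = time.time()
result = stub.Predict(request, 30.0)  # 30 s timeout
print('serving latency: {:.3f} s'.format(time.time() - start))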
Here is the code I use to convert the graph to the "saved_model.pb" served by TF Serving:
import os
import tensorflow as tf

IMAGE_H, IMAGE_W = 832, 832

def read_pb_return_tensors(graph, pb_file, return_elements):
    # Load the frozen inference graph and import it into `graph`,
    # returning the input placeholder and the output tensors by name.
    with tf.gfile.FastGFile(pb_file, 'rb') as f:
        frozen_graph_def = tf.GraphDef()
        frozen_graph_def.ParseFromString(f.read())
    with graph.as_default():
        return_elements = tf.import_graph_def(frozen_graph_def,
                                              return_elements=return_elements)
    input_tensor, output_tensors = return_elements[0], return_elements[1:]
    return input_tensor, output_tensors

export_path = "./export/1"  # versioned export directory (placeholder, adjust as needed)

gpu_nms_graph = tf.Graph()
input_tensor, output_tensors = read_pb_return_tensors(
    gpu_nms_graph,
    "./checkpoint/yolov3-logos-{}/yolov3-logos-{}_gpu_nms.pb".format(IMAGE_H, IMAGE_H),
    ["Placeholder:0", "concat_10:0", "concat_11:0", "concat_12:0"])

with tf.compat.v1.Session(graph=gpu_nms_graph) as sess:
    # Start from a clean export directory so SavedModelBuilder does not fail.
    if os.path.isdir(export_path):
        os.rmdir(export_path)
    builder = tf.compat.v1.saved_model.builder.SavedModelBuilder(export_path)

    # One input ('inputs') and one entry per output tensor in the signature.
    tensor_info_inputs = {
        'inputs': tf.compat.v1.saved_model.utils.build_tensor_info(input_tensor)
    }
    tensor_info_outputs = {}
    for k, v in enumerate(output_tensors):
        tensor_info_outputs['output_{}'.format(k)] = \
            tf.compat.v1.saved_model.utils.build_tensor_info(v)

    detection_signature = (
        tf.compat.v1.saved_model.signature_def_utils.build_signature_def(
            inputs=tensor_info_inputs,
            outputs=tensor_info_outputs,
            method_name=tf.compat.v1.saved_model.signature_constants.PREDICT_METHOD_NAME))

    # Register the signature both under a custom key and as the default one.
    builder.add_meta_graph_and_variables(
        sess, [tf.compat.v1.saved_model.tag_constants.SERVING],
        signature_def_map={
            'Detect_logos': detection_signature,
            tf.compat.v1.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                detection_signature
        },
        main_op=tf.compat.v1.tables_initializer(),
        strip_default_attrs=True)
    builder.save()
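For what it's worth, a quick cross-check is to load the exported SavedModel back into a plain tf.Session and time it directly, which takes the TF Serving layer out of the picture. A minimal sketch (the tensor names are assumptions: the 'import/' prefix comes from tf.import_graph_def above, and the authoritative names can be read from the exported SignatureDef):
import time
import numpy as np
import tensorflow as tf

graph = tf.Graph()
with tf.compat.v1.Session(graph=graph) as sess:
    # Load the SavedModel produced by the builder above.
    tf.compat.v1.saved_model.loader.load(
        sess, [tf.compat.v1.saved_model.tag_constants.SERVING], export_path)

    # Assumed tensor names; tf.import_graph_def prefixes nodes with 'import/'.
    input_name = 'import/Placeholder:0'
    output_names = ['import/concat_10:0', 'import/concat_11:0', 'import/concat_12:0']

    image = np.random.rand(1, 832, 832, 3).astype(np.float32)  # dummy input
    start = time.time()
    sess.run(output_names, feed_dict={input_name: image})
    print('direct SavedModel latency: {:.3f} s'.format(time.time() - start))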
I am using TensorFlow 1.14.0, tensorflow-model-server 1.14.0, and tensorflow-serving-api 1.14.0.
I should also mention that a few months ago I used the same code to serve a very similar model (also a single-class YOLO). The inference latency increased then too (from 40 ms to about 200 ms), but far less dramatically. I was using an older version of TensorFlow at the time.