I wrote my client following the Serving client example at https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/inception_client.py
and a single prediction took roughly 600-700 ms. Then I found this blog post: https://towardsdatascience.com/tensorflow-serving-client-make-it-slimmer-and-faster-b3e5f71208fb. After following it, the prediction time dropped to about 20 ms, but all I did was replace the call to tf.contrib.util.make_tensor_proto with:
dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=1)]
tensor_shape_proto = tensor_shape_pb2.TensorShapeProto(dim=dims)
tensor_proto = tensor_pb2.TensorProto(
    dtype=types_pb2.DT_FLOAT,
    tensor_shape=tensor_shape_proto)
for vector in vectors:
    for vector_item in vector:
        tensor_proto.float_val.append(vector_item)
request.inputs['vectors'].CopyFrom(tensor_proto)
and import the relevant proto modules directly, like this:
from tensorflow.core.framework import tensor_pb2
from tensorflow.core.framework import tensor_shape_pb2
from tensorflow.core.framework import types_pb2
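For reference, the double loop above just flattens the batch into the flat, row-major list of floats that TensorProto.float_val expects. As a plain-Python sketch (the helper name is mine, not from either post):

```python
def flatten_vectors(vectors):
    """Flatten a batch of vectors into the row-major float list
    that TensorProto.float_val expects."""
    return [float(v) for vec in vectors for v in vec]

# e.g. flatten_vectors([[1, 2], [3, 4]]) -> [1.0, 2.0, 3.0, 4.0]
```

With this, the inner loop can be replaced by tensor_proto.float_val.extend(flatten_vectors(vectors)), since protobuf repeated scalar fields support extend().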
I can't figure out why this is faster than the example code. I even copied the make_tensor_proto implementation into my own code, and it was still much slower than the snippet above. They execute the same code, so why are the results so different?
Thanks for your help.
Answer 0 (score: 0)
By replacing that snippet, you removed the tensorflow dependency from your client program. That should be the main reason for the large reduction in inference time you observed.
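One way to confirm this on your own machine is to time the request-construction step separately from the gRPC call itself. A minimal stdlib-only timing helper (a sketch; the function names in the comment are placeholders for your own code):

```python
import time

def mean_seconds_per_call(fn, repeats=100):
    # Run fn `repeats` times and return the mean wall-clock
    # seconds per call.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# Example usage (hypothetical builders, not defined here):
#   mean_seconds_per_call(lambda: make_tensor_proto(vectors))      # TF helper
#   mean_seconds_per_call(lambda: build_proto_directly(vectors))   # raw protos
```

If the direct-proto builder is much faster per call, the difference is client-side serialization overhead, not the server's inference time.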