I created and trained a TensorFlow model, and now I want to use it to make predictions. To do that, I saved it and loaded it into a server with TensorFlow Serving.
Once the server is up and ready to receive requests, I send the same request twice, leaving enough time between them for the model to finish the first prediction before starting the second. I expected the two responses returned by the server to be identical, since they correspond to the same model predicting on the same input. However, that is not the case.
Has anyone run into the same problem, or does anyone know what could be causing it?
I tried:
But none of these solved my problem.
Also, here is what:
saved_model_cli show --dir {path} --all
returns:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is:

signature_def['model_densenet_2D_new']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input'] tensor_info:
        dtype: DT_UINT8
        shape: (-1, 512, 512, 1)
        name: model_densenet_2D_new_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['semantic_labels'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 512, 512, 2)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict

Defined Functions:
  Function Name: 'served_model'
    Option #1
      Callable with:
        Argument #1
          input: TensorSpec(shape=(None, 512, 512, 1), dtype=tf.uint8, name='input')
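For reference, this is roughly how I build and compare the two requests. The signature above expects a uint8 tensor of shape (-1, 512, 512, 1), so each instance is one (512, 512, 1) image. This is a sketch assuming TensorFlow Serving's REST API on the default port 8501 and the model being served under the name `model_densenet_2D_new`; the URL and helper names are placeholders, adjust them to your deployment.

```python
import json
import numpy as np

# Assumed endpoint: default TF Serving REST port and a guessed model name.
URL = "http://localhost:8501/v1/models/model_densenet_2D_new:predict"

def build_payload(image: np.ndarray) -> str:
    """Serialize one uint8 image of shape (512, 512, 1) into a TF Serving
    REST predict request, using the signature name from saved_model_cli."""
    assert image.shape == (512, 512, 1) and image.dtype == np.uint8
    return json.dumps({
        "signature_name": "model_densenet_2D_new",
        "instances": [image.tolist()],
    })

def responses_match(a, b, atol=1e-6) -> bool:
    """Elementwise comparison of two 'predictions' arrays from the server."""
    return np.allclose(np.asarray(a), np.asarray(b), atol=atol)

# To reproduce the issue (requires the server to be running):
#   import requests
#   img = np.zeros((512, 512, 1), dtype=np.uint8)  # same input both times
#   r1 = requests.post(URL, data=build_payload(img)).json()["predictions"]
#   r2 = requests.post(URL, data=build_payload(img)).json()["predictions"]
#   print(responses_match(r1, r2))  # I would expect True, but I get False
```

Comparing with a small tolerance instead of exact equality rules out harmless floating-point jitter; in my case the differences are larger than that.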