Error (undefined behavior) in the TFS MultiInference RPC method

Date: 2019-02-05 10:14:57

Tags: tensorflow tensorflow-serving

I am trying to use the MultiInference method to query two different versions of the same model in a single request (for A/B testing). In one case I get the error "Duplicate evaluation of signature: classification", and in another case I get very strange results.

Examples:

  1. This is the case when I use the model "stpeter" in versions 7 and 8 with signature_name = "classification" in both tasks.
Input request:
tasks {
  model_spec {
    name: "stpeter"
    version {
      value: 7
    }
    signature_name: "classification"
  }
  method_name: "tensorflow/serving/classify"
}
tasks {
  model_spec {
    name: "stpeter"
    version {
      value: 8
    }
    signature_name: "classification"
  }
  method_name: "tensorflow/serving/classify"
}
input {
  example_list {
    examples {
      features {
        feature {
          key: "inputs"
          value {
            bytes_list {
              value: "ala.kowalska"
            }
          }
        }
      }
    }
  }
}

Traceback (most recent call last):
  File "ab_test.py", line 146, in <module>
    do_inference(args)
  File "ab_test.py", line 123, in do_inference
    results = stub.MultiInference(request, 10)
  File "/anaconda3/envs/ents/lib/python3.6/site-packages/grpc/_channel.py", line 533, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/anaconda3/envs/ents/lib/python3.6/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Duplicate evaluation of signature: classification"
    debug_error_string = "{"created":"@1549359403.703597000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Duplicate evaluation of signature: classification","grpc_status":3}"
>
  2. So, let's change one of the signature_names (here, the signature_name is simply omitted from the second task).
Input request:
tasks {
  model_spec {
    name: "stpeter"
    version {
      value: 7
    }
    signature_name: "classification"
  }
  method_name: "tensorflow/serving/classify"
}
tasks {
  model_spec {
    name: "stpeter"
    version {
      value: 8
    }
  }
  method_name: "tensorflow/serving/classify"
}
input {
  example_list {
    examples {
      features {
        feature {
          key: "inputs"
          value {
            bytes_list {
              value: "ala.kowalska"
            }
          }
        }
      }
    }
  }
}

Results:
results {
  model_spec {
    name: "stpeter"
    version {
      value: 7
    }
    signature_name: "classification"
  }
  classification_result {
    classifications {
      classes {
        label: "BOT"
        score: 0.010155047290027142
      }
      classes {
        label: "HUMAN"
        score: 0.9898449182510376
      }
    }
  }
}
results {
  model_spec {
    name: "stpeter"
    version {
      value: 7
    }
    signature_name: "serving_default"
  }
  classification_result {
    classifications {
      classes {
        label: "BOT"
        score: 0.010155047290027142
      }
      classes {
        label: "HUMAN"
        score: 0.9898449182510376
      }
    }
  }
}

It seems to work fine (no error). But let's take a closer look at the results. Although task #2 specifies version {value: 8}, both answers (signature_name = "classification" and signature_name = "serving_default") come from version 7 of "stpeter".
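Before sending such a request, the failure from case 1 can be caught client-side. A minimal sketch in plain Python; for simplicity it models each InferenceTask as a dict instead of the real protobuf class, and it assumes (based on the error message above) that TFS keys its duplicate check on the signature name alone:

```python
def find_duplicate_signatures(tasks):
    """Return signature names that appear in more than one task.

    If TFS deduplicates on signature name only, two tasks with the
    same signature fail even when their model versions differ --
    exactly the INVALID_ARGUMENT seen in case 1.
    """
    seen, dupes = set(), set()
    for task in tasks:
        sig = task.get("signature_name", "serving_default")
        if sig in seen:
            dupes.add(sig)
        seen.add(sig)
    return sorted(dupes)

# The failing request from case 1: same signature, different versions.
tasks = [
    {"name": "stpeter", "version": 7, "signature_name": "classification"},
    {"name": "stpeter", "version": 8, "signature_name": "classification"},
]
print(find_duplicate_signatures(tasks))  # ['classification']
```

Running this check before the RPC turns the server-side INVALID_ARGUMENT into an early client-side failure.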

The served model is created with a TensorFlow Estimator and saved with export_savedmodel, so the following signatures are available:

INFO:tensorflow:Signatures INCLUDED in export for Classify: ['serving_default', 'classification']
INFO:tensorflow:Signatures INCLUDED in export for Regress: ['regression']
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict']

There is no restriction on model versions in inference.proto.

I also checked the TFS test cases, but it seems my case is not covered there.

I would be grateful for even a small hint that helps me solve this problem.

1 answer:

Answer 0 (score: 0)

I think the source of the confusion is that, at model-version granularity, a MultiInference request is designed to apply to a single saved model. In fact, the model version specified by the first model spec in the request is the only one that matters (see here). It would arguably be cleaner to enforce the same model version, just as the same model name is already enforced here. The comment there would then be updated to: "All ModelSpecs in a MultiInferenceRequest must access the same model name and version."
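The stricter check proposed above can also be applied client-side today. A sketch, again modeling ModelSpecs as plain dicts rather than the real protobuf messages:

```python
def validate_single_model(tasks):
    """Raise if the tasks reference more than one (name, version) pair.

    TFS currently enforces only the name and silently serves every
    task from the first task's version; this check makes that
    restriction explicit before the request is sent.
    """
    specs = {(t["name"], t.get("version")) for t in tasks}
    if len(specs) > 1:
        raise ValueError(
            "MultiInference tasks must share one model, got: %s" % sorted(specs)
        )

# The request from case 2 would be rejected up front:
tasks = [
    {"name": "stpeter", "version": 7},
    {"name": "stpeter", "version": 8},
]
try:
    validate_single_model(tasks)
except ValueError as e:
    print(e)
```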

Happy to help with any other questions.
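In the meantime, the A/B test itself does not need MultiInference: you can pick a version per user and send one ordinary Classify request against it. A sketch of one common approach, deterministic hash-based bucketing (the scheme and parameters here are my own suggestion, not anything from TFS):

```python
import hashlib

def pick_version(user_id, versions=(7, 8), split=0.5):
    """Deterministically assign a user to one model version.

    The same user always lands in the same bucket, so each request
    goes out as a single Classify call against that version.
    """
    # Stable hash of the user id, mapped to a float in [0, 1).
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return versions[0] if bucket < split else versions[1]

print(pick_version("ala.kowalska"))  # same version on every call for this user
```

The chosen version then goes into the model_spec.version of a normal ClassificationRequest, sidestepping the single-model restriction entirely.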