Cannot get GCloud online predictions from a deployed TF BERT SavedModel: "Bad Request" error

Time: 2019-12-02 10:08:42

Tags: tensorflow google-cloud-ml

I trained a BERT model based on this notebook.

I exported it as a TF SavedModel like this:

def serving_input_fn():
    receiver_tensors = {
        "input_ids": tf.placeholder(dtype=tf.int32, shape=[1, MAX_SEQ_LENGTH])
    }

    features = {
        "input_ids": receiver_tensors['input_ids'],
        "input_mask": 1 - tf.cast(tf.equal(receiver_tensors['input_ids'], 0), dtype=tf.int32),
        "segment_ids": tf.zeros(dtype=tf.int32, shape=[1, MAX_SEQ_LENGTH]),
        "label_ids": tf.placeholder(tf.int32, [None], name='label_ids')
    }
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)


estimator._export_to_tpu = False
estimator.export_saved_model("export", serving_input_fn)

If I then use the saved model locally, it works:

from tensorflow.contrib import predictor

predict_fn = predictor.from_saved_model("export/1575241274/")

print(predict_fn({
    "input_ids": [[101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
}))

# {'probabilities': array([[-0.01023898, -4.5866656 ]], dtype=float32), 'labels': 0}

Then I uploaded the SavedModel to a bucket and created a model and a model version on gcloud with:

gcloud alpha ai-platform versions create v1gpu --model [...] --origin=[...] --python-version=3.5 --runtime-version=1.14 --accelerator=^:^count=1:type=nvidia-tesla-k80 --machine-type n1-highcpu-4

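This went through without problems: the model is deployed and shows as working in the console.

However, if I try to get a prediction, for example: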

import googleapiclient.discovery

service = googleapiclient.discovery.build('ml', 'v1')
name = 'projects/[project_name]/models/[model_name]/versions/v1gpu'

response = service.projects().predict(
        name=name,
        body={'instances': [{
    "input_ids": [[101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
}]}
).execute()

print(response["predictions"])

All I get is the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/http.py", line 851, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://ml.googleapis.com/v1/projects/[project_name]/models/[model_name]/versions/v1gpu:predict?alt=json returned "Bad Request">

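I get the same error if I test the model from the gcloud console with the "Test your model with sample input data" feature.

Edit:

The SavedModel has one tag set, "serve", and one signature def, "serving_default".

Output of "saved_model_cli show --dir 1575241274/ --tag_set serve --signature_def serving_default":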

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (1, 128)
      name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['labels'] tensor_info:
      dtype: DT_INT32
      shape: ()
      name: loss/Squeeze:0
  outputs['probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 2)
      name: loss/LogSoftmax:0
Method name is: tensorflow/serving/predict

2 answers:

Answer 0 (score: 0)

The body of the request sent to the API has the following form:

{"instances": [<instance 1>, <instance 2>, ...]}

According to the documentation, this is what is needed:

{
    "instances": [
        <object>
        ...
    ]
}

In your case, you have:

{ 
    "instances": [ 
        {
           "input_ids": 
             [ <object> ] 
        }

     ...
    ]
}

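You need to replace the "input_ids" object with the instance itself, so that each entry of "instances" is the flat list of token ids: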

{  
    "instances": 
     [
        [101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
     ]
}
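As a rough sketch of the change described above, the batched input that worked with the local predictor can be unwrapped so that each row becomes one instance. The helper name `build_request_body` is hypothetical and not part of any Google client library, and the token list is shortened for readability.

```python
import json


def build_request_body(batched_input_ids):
    """Turn a batch shaped [batch, seq_len] into an online-prediction body.

    Each row of the batch becomes one entry of "instances", so a single
    instance is a flat list of token ids rather than a nested list.
    """
    instances = []
    for row in batched_input_ids:
        if any(isinstance(token, (list, tuple)) for token in row):
            raise ValueError("each instance must be a flat list of token ids")
        instances.append(list(row))
    return {"instances": instances}


# Hypothetical, shortened input; the model in the question expects 128 ids.
local_input = [[101, 10468, 99304, 102, 0, 0]]
body = build_request_body(local_input)
print(json.dumps(body))
```

The resulting `body` is what would be passed as the `body` argument of `service.projects().predict(...)` in the question.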

Note: it would help if you could show the output of saved_model_cli.

Also, the gcloud ai-platform local predict command is a good option for testing.
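For local testing of this kind, the instances are usually supplied as a file with one JSON object per line. The sketch below writes such a file; the helper name `write_json_instances`, the temporary file location, and the shortened token list are assumptions made for illustration, and the command in the comment should be checked against your installed gcloud version.

```python
import json
import os
import tempfile


def write_json_instances(instances, path):
    """Write one JSON object per line (newline-delimited JSON)."""
    with open(path, "w") as handle:
        for instance in instances:
            handle.write(json.dumps(instance) + "\n")


# Hypothetical, shortened instance; pad to 128 ids for the real model.
instances = [{"input_ids": [101, 10468, 99304, 102, 0, 0]}]
instances_path = os.path.join(tempfile.mkdtemp(), "instances.json")
write_json_instances(instances, instances_path)

# Then, from a shell (model directory taken from the question):
#   gcloud ai-platform local predict --model-dir export/1575241274/ \
#       --json-instances <instances_path> --framework tensorflow
print(instances_path)
```

Running the model locally this way tends to surface shape errors in the instances that the online endpoint only reports as "Bad Request".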

Answer 1 (score: 0)

It depends on the signature of your model. In my case, I have the following signature (only the inputs part is kept):

The given SavedModel SignatureDef contains the following input(s):
  inputs['attention_mask'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_attention_mask:0
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_input_ids:0
  inputs['token_type_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_token_type_ids:0

And I need to pass the data in the following format (2 examples in this case):

{'instances': 
  [
    {'input_ids': [101, 143, 18267, 15470, 90395, ...], 
     'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, .....], 
     'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, .....]
     }, 
     {'input_ids': [101, 17664, 143, 30728, .........], 
      'attention_mask': [1, 1, 1, 1, 1, 1, 1, .......], 
      'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, ....]
      }
  ]
}
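Tokenizers often return one list per input name covering the whole batch, while the format above wants one dict per example. A minimal sketch of that conversion follows; `batch_to_instances` is a hypothetical helper, and the sequences are shortened to four tokens instead of 128.

```python
def batch_to_instances(batch):
    """Convert {"name": [[...], [...]]} columns into one dict per example."""
    names = list(batch)
    lengths = {len(batch[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all inputs must contain the same number of examples")
    count = lengths.pop()
    return [
        {name: list(batch[name][i]) for name in names}
        for i in range(count)
    ]


# Hypothetical batch of two shortened examples.
batch = {
    "input_ids": [[101, 143, 18267, 102], [101, 17664, 143, 102]],
    "attention_mask": [[1, 1, 1, 1], [1, 1, 1, 1]],
    "token_type_ids": [[0, 0, 0, 0], [0, 0, 0, 0]],
}
request_body = {"instances": batch_to_instances(batch)}
print(len(request_body["instances"]))
```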

I use this with a Keras TensorFlow 2.2.0 model.

I guess that in your case it would be (for two examples):

{'instances': 
  [
    {'input_ids': [101, 143, 18267, 15470, 90395, ...]}, 
    {'input_ids': [101, 17664, 143, 30728, .........]}
  ]
}
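A small sketch of building such a body for the model in the question, padding each token list to the 128 positions shown in its signature. The helper `pad_input_ids` is hypothetical; the pad id of 0 is taken from the question's `serving_input_fn`, which masks positions where `input_ids` equals 0. Since that signature has shape (1, 128), this sketch sends a single instance; sending several instances in one request may require re-exporting with a variable batch dimension, which is an assumption and not something confirmed in this thread.

```python
MAX_SEQ_LENGTH = 128  # matches shape (1, 128) in the question's signature
PAD_ID = 0  # the question's serving_input_fn treats id 0 as padding


def pad_input_ids(token_ids, max_len=MAX_SEQ_LENGTH, pad_id=PAD_ID):
    """Right-pad a flat list of token ids to exactly max_len entries."""
    if len(token_ids) > max_len:
        raise ValueError("sequence is longer than max_len")
    return list(token_ids) + [pad_id] * (max_len - len(token_ids))


# Unpadded token ids from the question's example.
token_ids = [101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102]
body = {"instances": [{"input_ids": pad_input_ids(token_ids)}]}
print(len(body["instances"][0]["input_ids"]))
```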