Question

I have a TensorFlow Serving container in a SageMaker endpoint. I am able to send a batch of images as a NumPy array and get predictions like this:

import numpy as np
import sagemaker
from sagemaker.predictor import json_serializer, json_deserializer

image = np.random.uniform(low=-1.0, high=1.0, size=(1,128,128,3)).astype(np.float32)    
image = {'instances': image}
image = json_serializer(image)

# endpoint_name and sagemaker_session are assumed to be defined earlier
request_args = {}
request_args['Body'] = image
request_args['EndpointName'] = endpoint_name
request_args['ContentType'] = 'application/json'
request_args['Accept'] = 'application/json'

# works successfully
response = sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
response_body = response['Body']
predictions = json_deserializer(response_body, response['ContentType'])

Sent this way, the request_args payload is large. Is there a way to send it in a more compact format?

I have tried using base64 and json.dumps, but cannot get past Invalid argument: JSON Value: ... errors. I am not sure whether this is unsupported or whether I am doing something wrong.

Answer 1

I have talked with AWS support about this (see More efficient way to send a request than JSON to deployed tensorflow model in Sagemaker?).

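They suggested passing a custom input_fn to the serving container, in which a compressed format (such as protobuf) can be unpacked before the data reaches the model.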

I will test this soon and hope it works, since it would add a lot of flexibility to input handling.
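
I have not run this against a live endpoint, but the idea can be sketched as follows. It assumes the SageMaker TensorFlow Serving container loads an inference.py that defines input_handler(data, context), where data is a readable stream and context.request_content_type holds the request's content type. The content type string, the pack helper and the choice of .npy plus zlib are my own placeholders, not anything AWS support specified (they mentioned protobuf as one option).

```python
import io
import json
import zlib

import numpy as np

# Hypothetical content type agreed between client and handler; not a SageMaker standard.
NPY_ZLIB = "application/x-npy-zlib"


def pack(array):
    """Client side: serialize an array to .npy bytes and compress them with zlib."""
    buf = io.BytesIO()
    np.save(buf, array, allow_pickle=False)
    return zlib.compress(buf.getvalue())


def input_handler(data, context):
    """Container side: rebuild the array and emit the JSON body TensorFlow Serving expects."""
    if context.request_content_type != NPY_ZLIB:
        raise ValueError(
            "Unsupported content type: {}".format(context.request_content_type)
        )
    raw = zlib.decompress(data.read())
    array = np.load(io.BytesIO(raw), allow_pickle=False)
    return json.dumps({"instances": array.tolist()})
```

On the client, the body would then be pack(image) on the raw array, with request_args['ContentType'] set to the same content type string. Most of the saving comes from sending 4 raw bytes per float32 instead of a long decimal string; zlib adds little on random data but may help on real images. The JSON is still built inside the container, so only the network payload shrinks.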

TensorFlow Serving send data as b64 instead of Numpy Array
