SageMaker predict against a local instance: JSON error

Asked: 2018-06-04 07:48:56

Tags: python mxnet amazon-sagemaker

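I am trying to set up a transfer learning approach with MXNet on a SageMaker instance. Training and serving both start locally without any problem, and I use this Python code to predict: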

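def predict_mx(net, fname):
    # Read and decode the image, then display the original.
    with open(fname, 'rb') as f:
        img = image.imdecode(f.read())
        plt.imshow(img.asnumpy())
        plt.show()
    # Apply the test-time augmentations and display the result.
    data = transform(img, -1, test_augs)
    plt.imshow(data.transpose((1,2,0)).asnumpy()/255)
    plt.show()
    # Add a batch dimension and send the data as a nested list.
    data = data.expand_dims(axis=0)
    return net.predict(data.asnumpy().tolist())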

I checked that data.asnumpy().tolist() looks fine, and pyplot draws both images (the first is the original image, the second is the resized one). But net.predict raises an error:

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-171-ea0f1f5bdc72> in <module>()
----> 1 predict_mx(predictor.predict, './data2/burgers-imgnet/00103785.jpg')

<ipython-input-170-150a72b14997> in predict_mx(net, fname)
     30     plt.show()
     31     data = data.expand_dims(axis=0)
---> 32     return net(data.asnumpy().tolist())
     33 

~/Projects/Lab/ML/AWS/v/lib64/python3.6/site-packages/sagemaker/predictor.py in predict(self, data)
     89         if self.deserializer is not None:
     90             # It's the deserializer's responsibility to close the stream
---> 91             return self.deserializer(response_body, response['ContentType'])
     92         data = response_body.read()
     93         response_body.close()

~/Projects/Lab/ML/AWS/v/lib64/python3.6/site-packages/sagemaker/predictor.py in __call__(self, stream, content_type)
    290         """
    291         try:
--> 292             return json.load(codecs.getreader('utf-8')(stream))
    293         finally:
    294             stream.close()

/usr/lib64/python3.6/json/__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    297         cls=cls, object_hook=object_hook,
    298         parse_float=parse_float, parse_int=parse_int,
--> 299         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    300 
    301 

/usr/lib64/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    352             parse_int is None and parse_float is None and
    353             parse_constant is None and object_pairs_hook is None and not kw):
--> 354         return _default_decoder.decode(s)
    355     if cls is None:
    356         cls = JSONDecoder

/usr/lib64/python3.6/json/decoder.py in decode(self, s, _w)
    337 
    338         """
--> 339         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    340         end = _w(s, end).end()
    341         if end != len(s):

/usr/lib64/python3.6/json/decoder.py in raw_decode(self, s, idx)
    355             obj, end = self.scan_once(s, idx)
    356         except StopIteration as err:
--> 357             raise JSONDecodeError("Expecting value", s, err.value) from None
    358         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

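I tried json.dumps on my data, and it works without any problem.

Note that I have not deployed the service on AWS yet. I want to be able to test the model and the prediction locally before running a larger training job and serving the model later.

Thanks for your help.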

2 answers:

Answer 0 (score: 1)

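The call to net.predict itself is working fine.

It looks like you are using the SageMaker Python SDK predict_fn for hosting. After predict_fn is called, the MXNet container tries to serialize your prediction to JSON before sending it back to the client. You can see the code that does this here: https://github.com/aws/sagemaker-mxnet-container/blob/master/src/mxnet_container/serve/transformer.py#L132

The container fails to serialize the result because net.predict does not return a JSON-serializable object. You can fix this by returning a list: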

return net.predict(data.asnumpy().tolist()).asnumpy().tolist()
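To illustrate why the conversion matters, here is a minimal sketch that uses only the standard library. `FakeNDArray` and `to_json` are hypothetical names invented for this illustration; `FakeNDArray` merely stands in for an array type that `json` cannot encode and is not part of MXNet or the SageMaker container.

```python
import json


class FakeNDArray:
    """Hypothetical stand-in for an array type that json cannot encode."""

    def __init__(self, values):
        self._values = values

    def tolist(self):
        # Return plain nested Python lists, which json can encode.
        return [list(row) for row in self._values]


def to_json(prediction):
    """Serialize a prediction, converting array-like objects to lists first."""
    if hasattr(prediction, "tolist"):
        prediction = prediction.tolist()
    return json.dumps(prediction)


raw = FakeNDArray([(0.1, 0.9)])

try:
    json.dumps(raw)
    serializable = True
except TypeError:
    # json.dumps raises TypeError for objects it does not know how to encode.
    serializable = False

encoded = to_json(raw)
```

Passing the raw object to `json.dumps` fails, while the nested list produced by `tolist()` serializes cleanly. The `.asnumpy().tolist()` chain in the fix above plays the same role.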

Another option is to use transform_fn instead of predict_fn, so that you handle the output serialization yourself. You can see an example of transform_fn at https://github.com/aws/sagemaker-python-sdk/blob/e93eff66626c0ab1f292048451c4c3ac7c39a121/examples/cli/host/script.py#L41

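Answer 1 (score: 0)

You are running into a problem deserializing the data passed from the notebook to the prediction environment (in Docker), but I cannot reproduce the issue with the code provided. When using the MXNet estimator (e.g. from sagemaker.mxnet import MXNet), you can implement transform_fn in your entry point script to deserialize the data and make predictions with the model. As in the example below, call json.loads at the start of the function.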

import json

import mxnet as mx


def transform_fn(net, data, input_content_type, output_content_type):
    """
    Transform a request using the Gluon model. Called once per request.
    :param net: The Gluon model.
    :param data: The request payload.
    :param input_content_type: The request content type.
    :param output_content_type: The (desired) response content type.
    :return: response payload and content type.
    """
    # we can use content types to vary input/output handling, but
    # here we just assume json for both
    parsed = json.loads(data)
    nda = mx.nd.array(parsed)
    output = net(nda)
    prediction = mx.nd.argmax(output, axis=1)
    response_body = json.dumps(prediction.asnumpy().tolist()[0])
    return response_body, output_content_type

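If you still have problems with the json.loads call, inspect the value of data and look carefully for encoding-related issues (for example, an invalid string that starts with \).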
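One way to inspect the payload is to wrap the parsing step so that a failure reports what was actually received. The following is a minimal sketch using only the standard library; `parse_json_payload` and its `preview_chars` parameter are hypothetical names for this illustration, not part of the SageMaker SDK, and the sketch assumes the payload is UTF-8 text or bytes.

```python
import json


def parse_json_payload(data, preview_chars=80):
    """Parse a JSON payload; on failure, report a preview of the raw data."""
    # Assumption: byte payloads are UTF-8 encoded.
    if isinstance(data, (bytes, bytearray)):
        data = data.decode("utf-8")
    try:
        return json.loads(data)
    except json.JSONDecodeError as err:
        preview = data[:preview_chars]
        raise ValueError(
            "Invalid JSON payload ({}); payload starts with {!r}".format(err, preview)
        ) from err
```

With this helper, an empty payload or one that starts with a backslash fails with the same "Expecting value: line 1 column 1 (char 0)" message seen in the question's traceback, but the error also shows the beginning of the offending data.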

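Note: the code in your function differs from the code in your stack trace, so you may want to confirm that you are running what you think you are running. You also mention that you have not deployed yet (locally or to an instance), but deployment is required for prediction.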