After deploying a scikit-learn model on AWS SageMaker, I invoke the endpoint like this:
import io
import json
import boto3
import pandas as pd
# Load the test rows and serialize them to header-less CSV text
payload = pd.read_csv('test3.csv')
payload_file = io.StringIO()
payload.to_csv(payload_file, header=None, index=None)
client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=payload_file.getvalue(),
    ContentType='text/csv')
# The response body is a stream; read and decode it before parsing
result = json.loads(response['Body'].read().decode())
print(result)
The code above works, but when I try this payload instead:
payload = np.array([[100,5,1,2,3,4]])
I get this error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from container-1 with message
"<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>500 Internal Server Error</title> <h1>
Internal Server Error</h1> <p>The server encountered an internal error and was unable to complete your request.
Either the server is overloaded or there is an error in the application.</p>
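The "Scikit-learn SageMaker Estimators and Models" documentation says:
The SageMaker Scikit-learn model server provides a default implementation of input_fn. This function deserializes JSON, CSV, or NPY encoded data into a NumPy array.
I would like to know how to modify the default so that it accepts a 2D NumPy array, so I can use it for real-time prediction.
Any suggestions? I tried using "Inference Pipeline with Scikit-learn and Linear Learner" as a reference, but I could not replace Linear Learner with a scikit-learn model. I got the same error.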
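Answer 0: (score: 0)
If anyone finds a way to change the default input_fn, predict_fn, and output_fn so that they accept a NumPy array or a string, please share it.
In the meantime, I did find a way to make this work with the default settings.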
import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[100.0,0.08276299999999992,77.24,0.0008276299999999992,43.56,
6.6000000000000005,69.60699488825647,66.0,583.0,66.0,6.503081996847735,44.765133295284,
0.4844340723821271,21.35599999999999],
[100.0,0.02812099999999873,66.24,0.0002855600000003733,43.56,6.6000000000000005,
1.6884635296354735,66.0,78.0,66.0,6.754543287329573,47.06480204081666,
0.42642318733140017,0.4703999999999951],
[100.0,4.374382,961.36,0.043743819999999996,25153.96,158.6,649.8146514292529,120.0,1586.0
,1512.0,-0.25255116297020636,1.2255274408634853,-2.5421402801039323,614.5056]]),
columns=['a', 'b', 'c','d','e','f','g','h','i','j','k','l','m','n'])
import io
# Serialize the DataFrame to header-less CSV text in memory
test_file = io.StringIO()
df.to_csv(test_file, header=None, index=None)
Then:
import boto3
client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=test_file.getvalue(),
    ContentType='text/csv')
import json
result = json.loads(response['Body'].read().decode())
print(result)
Still, if there is a better solution, it would be very helpful.
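A possible reason the raw np.array fails is that the Body argument of invoke_endpoint expects bytes or text that is already serialized, so the array has to be encoded first. The sketch below skips pandas and encodes a 2D array directly, either as CSV text or as NPY bytes, which are two of the formats the documentation quoted in the question lists for the default input_fn. The helper names array_to_csv, array_to_npy, and invoke are made up for illustration, and the 'application/x-npy' content type is my assumption about what the container expects for NPY data.

```python
import io

import numpy as np


def array_to_csv(array):
    """Serialize a 2D array to header-less CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    # fmt="%s" writes each value with str(), so integers stay integers
    np.savetxt(buffer, np.atleast_2d(array), delimiter=",", fmt="%s")
    return buffer.getvalue()


def array_to_npy(array):
    """Serialize an array to NPY bytes (hypothetical helper)."""
    buffer = io.BytesIO()
    np.save(buffer, np.asarray(array))
    return buffer.getvalue()


def invoke(endpoint_name, body, content_type):
    """Send an already-serialized body to the endpoint (not called here)."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name, Body=body, ContentType=content_type
    )
    return response["Body"].read().decode()


payload = np.array([[100, 5, 1, 2, 3, 4]])
csv_body = array_to_csv(payload)  # pass with ContentType 'text/csv'
npy_body = array_to_npy(payload)  # pass with ContentType 'application/x-npy' (assumed)
```

With a live endpoint, invoke(endpoint_name, csv_body, 'text/csv') would replace the DataFrame round trip above. The number of columns still has to match what the model was trained on.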
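Answer 1: (score: 0)
You should be able to set the serializer/deserializer on the predictor returned by your model.deploy() call. There is an example of doing this in the FM example notebook: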
Alexander Stepanov has written about it
Please give it a try and let me know whether it works for you!
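Besides setting a serializer on the client side, the original question asked how to change the default input_fn. As far as I understand the SageMaker scikit-learn container, an input_fn defined in the entry-point script takes the place of the default one. The following is only a sketch under that assumption: the (request_body, request_content_type) signature follows the SDK documentation as I remember it, and the set of content types handled here is my own choice, not the container's actual implementation.

```python
import io
import json

import numpy as np


def input_fn(request_body, request_content_type):
    """Deserialize a request into a 2D NumPy array (illustrative sketch)."""
    if request_content_type == "application/x-npy":
        # NPY is a binary format, so keep the body as bytes
        data = np.load(io.BytesIO(request_body), allow_pickle=False)
    else:
        if isinstance(request_body, (bytes, bytearray)):
            request_body = request_body.decode("utf-8")
        if request_content_type == "text/csv":
            # ndmin=2 keeps a single row as shape (1, n_features)
            data = np.loadtxt(io.StringIO(request_body), delimiter=",", ndmin=2)
        elif request_content_type == "application/json":
            data = np.array(json.loads(request_body), dtype=float)
        else:
            raise ValueError(
                "Unsupported content type: {}".format(request_content_type)
            )
    # Promote a single sample such as [1, 2, 3] to a 2D array for predict()
    return np.atleast_2d(data)
```

The function can be checked locally by calling it with the same body and content type you plan to send, before putting it in the training/inference script and redeploying.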