I'm trying to run my own algorithm container in Amazon SageMaker, and when deploying it I get the following error.
predictor = tree.deploy(1, 'ml.m4.xlarge', serializer=csv_serializer)
ValueError: Error hosting endpoint decision-trees-sample-2018-03-01-09-59-06-832: Failed Reason: The primary container for production variant AllTraffic did not pass the ping health check.
Then, when I run the same line of code again, I get the error below.
predictor = tree.deploy(1, 'ml.m4.xlarge', serializer=csv_serializer)
ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: Cannot create already existing endpoint "arn:aws:sagemaker:us-east-1:69759707XXxXX:endpoint/decision-trees-sample-2018-03-01-09-59-06-832".
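Answer 0 (score: 0)
See this issue: https://github.com/awslabs/amazon-sagemaker-examples/issues/210
@djarpin wrote:
The ping health check message is a common error that can be caused by several different problems. Usually, the error messages in the CloudWatch log group named /aws/sagemaker/Endpoints/ will give a more detailed explanation of why the ping health check did not pass.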
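If it helps, here is a rough sketch of pulling those log messages with boto3 instead of clicking through the console. It assumes the log group name is the prefix quoted above followed by the endpoint name, and that the client passed in behaves like `boto3.client("logs")`. The function name `fetch_endpoint_log_messages` and the `max_streams` parameter are made up for illustration.

```python
def fetch_endpoint_log_messages(logs_client, endpoint_name, max_streams=5):
    """Collect log messages for a SageMaker endpoint from CloudWatch Logs.

    logs_client is expected to behave like boto3.client("logs").
    Only the first page of events per stream is read, which is usually
    enough to see a startup traceback.
    """
    # Assumption: log group = "/aws/sagemaker/Endpoints/" + endpoint name.
    log_group = "/aws/sagemaker/Endpoints/" + endpoint_name

    streams = logs_client.describe_log_streams(logGroupName=log_group)["logStreams"]

    messages = []
    for stream in streams[:max_streams]:
        events = logs_client.get_log_events(
            logGroupName=log_group,
            logStreamName=stream["logStreamName"],
            startFromHead=True,  # oldest events first, so startup errors come first
        )["events"]
        messages.extend(event["message"] for event in events)
    return messages


# Example use (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("logs", region_name="us-east-1")
# for line in fetch_endpoint_log_messages(
#         client, "decision-trees-sample-2018-03-01-09-59-06-832"):
#     print(line)
```

A Python traceback from the serving process in that output is typically the real reason the ping check did not pass.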
Hope that helps!
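As for the second error in the question ("Cannot create already existing endpoint"): the endpoint from the failed attempt most likely still exists in a Failed state, so the name is taken. A minimal sketch of cleaning it up, assuming a client that behaves like `boto3.client("sagemaker")`; the helper name `delete_endpoint_if_failed` is hypothetical.

```python
def delete_endpoint_if_failed(sm_client, endpoint_name):
    """Delete the endpoint only when its status is 'Failed'.

    sm_client is expected to behave like boto3.client("sagemaker").
    Returns True if a delete request was sent, False otherwise.
    """
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    if description["EndpointStatus"] != "Failed":
        # Leave healthy or in-progress endpoints alone.
        return False

    # Show the recorded failure reason, if any, before removing the endpoint.
    print("Failure reason:", description.get("FailureReason", "<not reported>"))
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    return True


# Example use (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# sm = boto3.client("sagemaker", region_name="us-east-1")
# delete_endpoint_if_failed(sm, "decision-trees-sample-2018-03-01-09-59-06-832")
```

Deleting the endpoint only frees the name. The deploy call will presumably fail the health check again until the container problem shown in the logs is fixed.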