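Question: Deploying a model with the Azure Machine Learning service

Time: 2020-01-31 13:58:57

Tags: python-3.x azure web-services web-deployment

I successfully created a classifier model with the Azure Machine Learning service. After registering the model, I built the environment for a container instance and supplied the scoring file, the environment file, and the configuration file. Unfortunately, the deployment fails with an error. Here are my deployment service logs for more detail:

Service logs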

2020-02-07T06:21:10,612616835+00:00 - rsyslog/run 
2020-02-07T06:21:10,616528746+00:00 - iot-server/run 
2020-02-07T06:21:10,617958751+00:00 - gunicorn/run 
2020-02-07T06:21:10,627065178+00:00 - nginx/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2020-02-07T06:21:11,108893523+00:00 - iot-server/finish 1 0
2020-02-07T06:21:11,116794547+00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 19.9.0
Listening at: http://127.0.0.1:31311 (12)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 45
Initializing logger
Starting up app insights client
Starting up request id generator
Starting up app insight hooks
Invoking user's init function
2020-02-07 06:21:15,494 | azureml.core.run | DEBUG | Could not load run context RunEnvironmentException:
    Message: Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run.
    InnerException None
    ErrorResponse 
{
    "error": {
        "message": "Could not load a submitted run, if outside of an execution context, use experiment.start_logging to initialize an azureml.core.Run."
    }
}, switching offline: False
2020-02-07 06:21:15,495 | azureml.core.run | DEBUG | Could not load the run context and allow_offline set to False
2020-02-07 06:21:15,495 | azureml.core.model | DEBUG | Checking root for demo_Model.pkl because candidate dir azureml-models had 1 nodes: azureml-models/demomodel/8/demo_Model.pkl
User's init function failed
Encountered Exception Traceback (most recent call last):
  File "/var/azureml-server/aml_blueprint.py", line 163, in register
    main.init()
  File "/var/azureml-app/main.py", line 88, in init
    driver_module.init()
  File "score.py", line 13, in init
    model_path = Model.get_model_path('demo_Model.pkl')
  File "/opt/miniconda/lib/python3.6/site-packages/azureml/core/model.py", line 697, in get_model_path
    return Model._get_model_path_local(model_name, version)
  File "/opt/miniconda/lib/python3.6/site-packages/azureml/core/model.py", line 718, in _get_model_path_local
    return Model._get_model_path_local_from_root(model_name)
  File "/opt/miniconda/lib/python3.6/site-packages/azureml/core/model.py", line 761, in _get_model_path_local_from_root
    "set logging level to DEBUG.".format(candidate_model_path))
azureml.exceptions._azureml_exception.ModelNotFoundException: ModelNotFoundException:
    Message: Model not found in cache or in root at ./demo_Model.pkl. For more info,set logging level to DEBUG.
    InnerException None
    ErrorResponse 
{
    "error": {
        "message": "Model not found in cache or in root at ./demo_Model.pkl. For more info,set logging level to DEBUG."
    }
}

/opt/miniconda/lib/python3.6/site-packages/sklearn/externals/joblib/__init__.py:15: FutureWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
  warnings.warn(msg, category=FutureWarning)
Worker exiting (pid: 45)
Shutting down: Master
Reason: Worker failed to boot.
2020-02-07T06:21:15,663509630+00:00 - gunicorn/finish 3 0
2020-02-07T06:21:15,664398433+00:00 - Exit code 3 is not normal. Killing image.

Error

Running.......................................................

TimedOut

ERROR - Service deployment polling reached non-successful terminal state, current service state: Unhealthy
More information can be found using '.get_logs()'
Error:
{
  "code": "DeploymentTimedOut",
  "statusCode": 504,
  "message": "The deployment operation polling has TimedOut. The service creation is taking longer than our normal time. We are still trying to achieve the desired state for the web service. Please check the webservice state for the current webservice health. From SDK you can run print(service.state) to know the current state of the webservice."
}

---------------------------------------------------------------------------
WebserviceException                       Traceback (most recent call last)
~/anaconda3_501/lib/python3.6/site-packages/azureml/core/webservice/webservice.py in wait_for_deployment(self, show_output)
    530                                           'Error:\n'
--> 531                                           '{}'.format(self.state, logs_response, error_response), logger=module_logger)
    532             print('{} service creation operation finished, operation "{}"'.format(self._webservice_type,

WebserviceException: WebserviceException:
    Message: Service deployment polling reached non-successful terminal state, current service state: Unhealthy
More information can be found using '.get_logs()'
Error:
{
  "code": "DeploymentTimedOut",
  "statusCode": 504,
  "message": "The deployment operation polling has TimedOut. The service creation is taking longer than our normal time. We are still trying to achieve the desired state for the web service. Please check the webservice state for the current webservice health. From SDK you can run print(service.state) to know the current state of the webservice."
}
    InnerException None
    ErrorResponse 
{
    "error": {
        "message": "Service deployment polling reached non-successful terminal state, current service state: Unhealthy\nMore information can be found using '.get_logs()'\nError:\n{\n  \"code\": \"DeploymentTimedOut\",\n  \"statusCode\": 504,\n  \"message\": \"The deployment operation polling has TimedOut. The service creation is taking longer than our normal time. We are still trying to achieve the desired state for the web service. Please check the webservice state for the current webservice health. From SDK you can run print(service.state) to know the current state of the webservice.\"\n}"
    }
}

During handling of the above exception, another exception occurred:

WebserviceException                       Traceback (most recent call last)
<timed exec> in <module>

~/anaconda3_501/lib/python3.6/site-packages/azureml/core/webservice/webservice.py in wait_for_deployment(self, show_output)
    538                                           'Current state is {}'.format(self.state), logger=module_logger)
    539             else:
--> 540                 raise WebserviceException(e.message, logger=module_logger)
    541 
    542     def _wait_for_operation_to_complete(self, show_output):

WebserviceException: WebserviceException:
    Message: Service deployment polling reached non-successful terminal state, current service state: Unhealthy
More information can be found using '.get_logs()'
Error:
{
  "code": "DeploymentTimedOut",
  "statusCode": 504,
  "message": "The deployment operation polling has TimedOut. The service creation is taking longer than our normal time. We are still trying to achieve the desired state for the web service. Please check the webservice state for the current webservice health. From SDK you can run print(service.state) to know the current state of the webservice."
}
    InnerException None
    ErrorResponse 
{
    "error": {
        "message": "Service deployment polling reached non-successful terminal state, current service state: Unhealthy\nMore information can be found using '.get_logs()'\nError:\n{\n  \"code\": \"DeploymentTimedOut\",\n  \"statusCode\": 504,\n  \"message\": \"The deployment operation polling has TimedOut. The service creation is taking longer than our normal time. We are still trying to achieve the desired state for the web service. Please check the webservice state for the current webservice health. From SDK you can run print(service.state) to know the current state of the webservice.\"\n}"
    }
}

This is what my web service code looks like:

%%time
from azureml.core.webservice import Webservice
from azureml.core.model import Model
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment

myenv = Environment.from_conda_specification(name="myenv", file_path="myenv.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=myenv)

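# ws (the workspace), model (the registered model), and aciconfig (the ACI
# deployment configuration) are assumed to be defined earlier; not shown here.
service = Model.deploy(workspace=ws,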
                       name='myimage',
                       models=[model], 
                       inference_config=inference_config,
                       deployment_config=aciconfig)

service.wait_for_deployment(show_output=True)

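Can anyone tell me what exactly this means, and how I can fix it?

Thanks,

Ahmed

1 answer:

Answer 0 (score: 0)

Updating the scikit-learn version resolved the problem in my environment.

Pin the version in myenv.yml as shown below. (In my environment, 0.20.3 was installed originally, and updating to 0.22.1 resolved the problem.)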

name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2

- pip:
  - azureml-defaults
- scikit-learn=0.22.1
channels:
- conda-forge
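
To confirm which scikit-learn version an environment file actually pins before redeploying, a small check like the one below may help. It is only a sketch using the standard library: `pinned_version` and `version_tuple` are hypothetical helper names, not part of the azureml SDK, and the embedded text simply mirrors the myenv.yml above. The comparison against the locally installed scikit-learn assumes the model was pickled with that local version.

```python
import re

# Example environment file content, mirroring the myenv.yml shown above.
ENV_YML = """\
name: project_environment
dependencies:
- python=3.6.2
- pip:
  - azureml-defaults
- scikit-learn=0.22.1
channels:
- conda-forge
"""


def pinned_version(env_text, package):
    """Return the version pinned for `package` in a conda env file, or None."""
    # Matches list entries such as "- scikit-learn=0.22.1" or "- scikit-learn==0.22.1".
    pattern = re.compile(
        r"^\s*-\s*" + re.escape(package) + r"\s*={1,2}\s*([0-9][\w.]*)\s*$",
        re.MULTILINE,
    )
    match = pattern.search(env_text)
    return match.group(1) if match else None


def version_tuple(version):
    """Turn '0.22.1' into (0, 22, 1); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


pinned = pinned_version(ENV_YML, "scikit-learn")
print("pinned scikit-learn:", pinned)

try:
    import sklearn  # only present where scikit-learn is installed

    local = sklearn.__version__
    # Compare major.minor only; patch releases are assumed to be compatible.
    if pinned and version_tuple(local)[:2] != version_tuple(pinned)[:2]:
        print("warning: local scikit-learn", local, "differs from pinned", pinned)
except ImportError:
    print("scikit-learn is not installed locally; skipping comparison")
```

In practice the text would be read from the real file, for example with `open("myenv.yml").read()`, instead of the embedded string.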