Mounting a dataset created from Azure Data Lake in AML service

Time: 2019-10-24 19:17:42

Tags: azure-machine-learning-service

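I am having trouble mounting a dataset (created from an Azure Data Lake datastore). I get the dataset by name and try to pass it as input to a TensorFlow estimator. The script parameter I provide is: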

'--data-folder': dataset.as_named_input('trainigdata').as_mount('tmp/dataset')

But I get the following exception:

Mounting trainigdata to tmp/dataset
ERROR - Uncaught exception from FUSE operation opendir, returning errno.EINVAL.
Traceback (most recent call last):
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/fuse.py", line 734, in _wrapper
    return func(*args, **kwargs) or 0
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/fuse.py", line 954, in opendir
    path.decode(self.encoding))
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/fuse.py", line 1076, in __call__
    return getattr(self, op)(*args)
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/fuse/dprepfuse.py", line 297, in opendir
    self._open_dirs[path] = self._list_entries(path)
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/fuse/dprepfuse.py", line 145, in _list_entries
    .to_pandas_dataframe(extended_types=True)
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/api/_loggerfactory.py", line 131, in wrapper
    return func(*args, **kwargs)
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/api/dataflow.py", line 706, in to_pandas_dataframe
    ExecuteAnonymousActivityMessageArguments(anonymous_activity=Dataflow._dataflow_to_anonymous_activity_data(dataflow_to_execute)))
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/api/_aml_helper.py", line 38, in wrapper
    return send_message_func(op_code, message, cancellation_token)
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/api/engineapi/api.py", line 88, in execute_anonymous_activity
    response = self._message_channel.send_message('Engine.ExecuteActivity', message_args, cancellation_token)
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/api/engineapi/engine.py", line 74, in send_message
    raise_engine_error(response['error'])
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/api/errorhandlers.py", line 22, in raise_engine_error
    raise ExecutionError(error_response)
azureml.dataprep.api.errorhandlers.ExecutionError: Could not execute the specified transform.|session_id=101b574b-cdd2-4975-a5bd-0e57c9fc061f
Logging warning in history service: ERROR:: Dataset  failed. . Exception Details:Traceback (most recent call last):
  File "/mnt/batch/tasks/shared/LS_root/jobs/env/azureml/trainprediction_aks_1571941512_8d9344d7/mounts/workspaceblobstore/azureml/trainprediction_AKS_1571941512_8d9344d7/azureml-setup/context_managers.py", line 208, in __enter__
    self.datasets.__enter__()
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/data/context_managers.py", line 119, in __enter__
    context_manager.__enter__()
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/fuse/daemon.py", line 92, in __enter__
    self._wait_until_mounted()
  File "/azureml-envs/azureml_f73412f070d144d39c8a826b53bde771/lib/python3.6/site-packages/azureml/dataprep/fuse/daemon.py", line 142, in _wait_until_mounted
    while not os.path.exists(self.mount_point) or len(os.listdir(self.mount_point)) == 0:
OSError: [Errno 22] Invalid argument: '/mnt/batch/tasks/shared/LS_root/jobs/env/azureml/trainprediction_aks_1571941512_8d9344d7/mounts/workspaceblobstore/azureml/trainprediction_AKS_1571941512_8d9344d7/tmp/dataset'

Can anyone help?

2 answers:

Answer 0: (score: 0)

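Currently, ADLS Gen2 can be mounted for training jobs through the blob API. A simpler solution is to sign up for multi-protocol access (MPA) (https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-multi-protocol-access), so that you can use the native Datastore API and mount points from ADLS Gen2 in AML.

Do you need folder-level ACLs to be honored for data access? If so, ACLs are not currently supported; that work is in development.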

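Answer 1: (score: 0)

Unfortunately, I was not able to reproduce this error with the latest azureml-sdk.

Instead of using a relative mount path, could you try the following: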

'--data-folder': dataset.as_named_input('trainigdata').as_mount('/tmp/dataset')

Specifically, change tmp/dataset to /tmp/dataset.
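To illustrate why the leading slash might matter: in the traceback from the question, the relative path tmp/dataset ended up under the job's working directory (.../trainprediction_AKS_1571941512_8d9344d7/tmp/dataset), which sits inside the workspaceblobstore mount. The following is only a minimal sketch of how a relative and an absolute path resolve in plain Python. The helper resolve_mount_point and the directory /job/workdir are hypothetical names for illustration; this is not the azureml implementation, and it is not confirmed that the relative path is the cause of the error.

```python
import posixpath


def resolve_mount_point(mount_path, working_dir):
    """Hypothetical helper: return the absolute location a mount path refers to.

    A relative path is joined onto the working directory; an absolute
    path is used as given.
    """
    if posixpath.isabs(mount_path):
        return posixpath.normpath(mount_path)
    return posixpath.normpath(posixpath.join(working_dir, mount_path))


# "/job/workdir" is a made-up stand-in for the job's working directory.
print(resolve_mount_point("tmp/dataset", "/job/workdir"))
print(resolve_mount_point("/tmp/dataset", "/job/workdir"))
```

With the absolute form, the mount point no longer depends on where the job happens to be running.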

Thanks!