Question

When I try to reference/load a dsource or dprep file generated from a data source file in blob storage, I get the error "No files for given path".

I tested with both .py and .ipynb files. Here is the code:

# Use the Azure Machine Learning data source package
from azureml.dataprep import datasource

df = datasource.load_datasource('POS.dsource') #Error generated here

# Remove this line and add code that uses the DataFrame
df.head(10)

Please let me know what other information would be helpful. Thanks!

Answer 1

I ran into the same issue, and it took some research to figure out!

Currently, data source files from blob storage are only supported for two cluster types: Azure HDInsight PySpark and Docker (Linux VM) PySpark.

To get it working, you need to follow the instructions in Configuring Azure Machine Learning Experimentation Service.

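Before submitting the first command, I also ran az ml experiment prepare -c <compute_name> to install all dependencies on the cluster, because that deployment takes quite a while (at least 10 minutes for my D12 v2 cluster).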

With the HDInsight PySpark compute cluster, the .py file now runs (for data stored in Azure blob). However, the .ipynb file still does not run on my local Jupyter server: the cells never finish.

Answer 2

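I'm from the Azure Machine Learning team. Sorry you are having issues with the Jupyter notebook. Have you tried running the notebook from the CLI? If you run it from the CLI, you should see stderr/stdout. The IFrame in WB (Workbench) swallows the actual error message. This might help you troubleshoot.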
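If it helps, here is a minimal sketch of capturing stderr/stdout from a command-line run so the real error message is not lost. It uses only the Python standard library. The helper name run_and_capture is hypothetical, and the demo command is a stand-in that just writes to stderr; substitute whatever CLI command you actually use (for example, the az ml experiment prepare -c <compute_name> command mentioned in the other answer).

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command (list of arguments) and return (exit_code, stdout, stderr) as text.

    Hypothetical helper for illustration; not part of any Azure ML package.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text instead of bytes
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command that fails and writes to stderr.
    # Replace with your own CLI call, e.g.
    # ["az", "ml", "experiment", "prepare", "-c", "<compute_name>"]
    demo_cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('No files for given path\\n'); sys.exit(1)",
    ]
    code, out, err = run_and_capture(demo_cmd)
    print("exit code:", code)
    print("stdout:", out.strip())
    print("stderr:", err.strip())
```

Printing or logging the captured stderr this way should surface the underlying message that the notebook IFrame hides.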

Azure ML Workbench File from Blob