Azure Databricks cluster init script - install a wheel from mounted storage

Asked: 2020-04-07 12:27:05

Tags: azure databricks azure-databricks

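I have a Python wheel uploaded to an Azure storage account that is mounted in a Databricks service. I am trying to install the wheel using a cluster init script, as described in the Databricks documentation.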

My storage is definitely mounted, and the file path looks correct to me. Running display(dbutils.fs.ls("/mnt/package-source")) in a notebook yields:

path: dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl
name: parser-3.0-py3-none-any.whl

I tried to install the wheel from the cluster init file with the following command:

/databricks/python/bin/pip install "dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl"

However, the cluster fails to start. Its log shows an error saying the file cannot be found:

WARNING: Requirement 'dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl' looks like a filename, but the file does not exist
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl'

I also tried it like this:

/databricks/python/bin/pip install /mnt/package-source/parser-3.0-py3-none-any.whl

But I get a similar error:

WARNING: Requirement '/mnt/package-source/parser-3.0-py3-none-any.whl' looks like a filename, but the file does not exist
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/mnt/package-source/parser-3.0-py3-none-any.whl'

I even tried relative paths such as ../../mnt/package-source/..., but to no avail. Can anyone tell me what I am doing wrong?

Related question: Azure Databricks cluster init script - install python wheel

1 answer:

Answer 0 (score: 2)

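I got it working with a relative path. It turns out that ../../mnt/ is not the correct path; ../../../dbfs/mnt/ works. It only took a little exploring of the file system with the bash ls command.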
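The working relative path suggests that the mount is visible to shell commands under a dbfs directory at the file system root, rather than under the dbfs: scheme that dbutils reports. A minimal sketch of an init script built on that observation follows. It assumes that DBFS is exposed on the cluster nodes at /dbfs; the helper name dbfs_to_local is hypothetical and not part of any Databricks tooling. The wheel path and the pip location are taken from the question.

```shell
#!/bin/bash
# Sketch only: assumes DBFS is visible on the node's local file system at /dbfs.

# Hypothetical helper: turn a dbutils-style path into a local path.
#   dbfs:/mnt/foo  -> /dbfs/mnt/foo
#   /mnt/foo       -> /dbfs/mnt/foo
#   /dbfs/mnt/foo  -> /dbfs/mnt/foo (already local, left unchanged)
dbfs_to_local() {
  local path="${1#dbfs:}"   # drop a leading "dbfs:" scheme if present
  case "$path" in
    /dbfs/*) printf '%s\n' "$path" ;;
    *)       printf '/dbfs%s\n' "$path" ;;
  esac
}

WHEEL="$(dbfs_to_local "dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl")"

if [ -f "$WHEEL" ]; then
  /databricks/python/bin/pip install "$WHEEL"
else
  # Report the path that was tried so the cluster log shows what went wrong.
  echo "wheel not found at $WHEEL" >&2
fi
```

Using an absolute path avoids depending on the init script's working directory, which the relative form ../../../dbfs/mnt/ does. If a missing wheel should stop the cluster from starting, an exit 1 could be added in the else branch.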

For anyone else running into the same problem, I suggest starting with the following in a notebook:

%sh
ls ../../../
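To see why one relative path works and the other does not, it can help to print the working directory and resolve each candidate path against it. The sketch below could be run in a notebook shell cell or at the top of an init script. The helper resolve_from is hypothetical, and /x/y/z is only a stand-in for a working directory three levels deep, since the actual directory is not stated here. The -m and -s options are those of GNU coreutils realpath.

```shell
#!/bin/bash
# Hypothetical helper: resolve a relative path against a base directory.
# -m tolerates components that do not exist; -s leaves symlinks unexpanded.
resolve_from() {
  realpath -m -s "$1/$2"
}

# Where relative paths start from in the current shell.
echo "working directory: $(pwd)"

# What the two relative paths point at from the current directory.
resolve_from "$(pwd)" ../../mnt
resolve_from "$(pwd)" ../../../dbfs/mnt

# Illustration with a stand-in directory three levels deep:
resolve_from /x/y/z ../../mnt            # prints /x/mnt
resolve_from /x/y/z ../../../dbfs/mnt    # prints /dbfs/mnt
```

Once the resolved location is known, the absolute form (for example /dbfs/mnt/package-source) can be passed to ls or pip directly.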