How do I import a notebook from my local machine into Azure Databricks?
I have a sample notebook in DBC format on my local machine, and I need to import it through the Notebook REST API.
curl -n -H "Content-Type: application/json" -X POST -d @- https://YOUR_DOMAIN/api/2.0/workspace/import <<JSON
{
  "path": "/Users/user@example.com/new-notebook",
  "format": "SOURCE",
  "language": "SCALA",
  "content": "Ly8gRGF0YWJyaWNrcyBub3RlYm9vayBzb3VyY2UKcHJpbnQoImhlbGxvLCB3b3JsZCIpCgovLyBDT01NQU5EIC0tLS0tLS0tLS0KCg==",
  "overwrite": "false"
}
JSON
Please refer to this doc.
The docs take a destination file path, but they never mention a source file path; the file is instead supplied as content. So how do I attach the source file to the import request?
Answer 0: (score: 3)
If you have a DBC file, then the format needs to be DBC, and language is ignored. Also, the content property needs to be the DBC file's bytes, Base64 encoded, per the docs:

"The content parameter contains base64 encoded notebook content"

If using bash, you could simply run base64 notebook.dbc.
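Putting the answer above together, here is a minimal Python sketch of building the import request body for a DBC file: read the file bytes, Base64-encode them, and set format to DBC. The helper name and the target path are illustrative, not part of the Databricks API; the resulting dict is what you would POST to /api/2.0/workspace/import.

```python
import base64

def build_import_payload(dbc_path, target_path):
    """Build the JSON body for /api/2.0/workspace/import from a local DBC file."""
    with open(dbc_path, "rb") as f:
        # content must be the raw file bytes, Base64 encoded
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {
        "path": target_path,   # destination path inside the workspace
        "format": "DBC",       # for DBC archives; language is ignored
        "content": encoded,
        "overwrite": "false",
    }
```

The returned dict can then be sent with curl -d or any HTTP client, in place of the hand-built JSON from the question.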
Answer 1: (score: 0)
The source file path is ignored because you are expected to convert that file to Base64 and put the resulting string into content, which makes the path irrelevant.
If you'd rather not do that and don't mind using curl, the docs also say you can do it like this:
curl -n -F path=/Users/user@example.com/project/ScalaExampleNotebook -F language=SCALA \
-F content=@example.scala \
https://<databricks-instance>/api/2.0/workspace/import
Otherwise, if you happen to be looking for how to import a whole directory... I spent a few hours figuring that out myself. It uses the databricks-cli library (in Python):
$ pip install databricks-cli
and then
from databricks_cli.workspace.api import WorkspaceApi
from databricks_cli.sdk.api_client import ApiClient

# Authenticate against the workspace with a personal access token
client = ApiClient(
    host='https://your.databricks-url.net',
    token=api_key
)
workspace_api = WorkspaceApi(client)

# Recursively upload a local directory into the workspace
workspace_api.import_workspace_dir(
    source_path="/your/dir/here/MyProject",
    target_path="/Users/user@example.com/MyProject",
    overwrite=True,
    exclude_hidden_files=True
)