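I need to import many notebooks (both Python and Scala) into Databricks using the Databricks REST API 2.0.
My source path (on the local machine) is ./db_code, and the destination (in the Databricks workspace) is /Users/dmitriy@kagarlickij.com.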
I am trying to build the 2.0/workspace/import call, so my request body is: { "content": "$SOURCES_PATH", "path": "$DESTINATION_PATH", "format": "SOURCE", "language": "SCALA", "overwrite": true }
But I get an error: Could not parse request object: Illegal character.
According to the documentation, content must be "The base64-encoded content".
Should I encode all the notebooks to base64?
Are there perhaps some examples of importing a whole directory into Databricks using the API?
Answer 0 (score: 1)
After some research, I managed to get it working:
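# Recreate the local directory tree in the Databricks workspace.
# $SOURCES_PATH, $WORKDIR_PATH, $DESTINATION_PATH, $DATABRICKS_REGION and $HEADERS are expected to be set beforehand.
Write-Output "Task: Create Databricks Directory Structure"
Get-ChildItem "$SOURCES_PATH" -Recurse -Directory |
Foreach-Object {
    # Directory path relative to the working directory
    $DIR = $_.FullName.split("$WORKDIR_PATH/")[1]
    $BODY = @{
        "path" = "$DESTINATION_PATH/$DIR"
    }
    $BODY_JSON = $BODY | ConvertTo-Json
    Invoke-RestMethod -Method POST -Uri "https://$DATABRICKS_REGION.azuredatabricks.net/api/2.0/workspace/mkdirs" -Headers $HEADERS -Body $BODY_JSON | Out-Null
}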
# Import every Scala notebook; the API requires "content" to be base64-encoded.
Write-Output "Task: Deploy Scala notebooks to Databricks"
Get-ChildItem "$SOURCES_PATH" -Recurse -Include *.scala |
Foreach-Object {
    # Notebook path relative to the working directory
    $NOTEBOOK_NAME = $_.FullName.split("$WORKDIR_PATH/")[1]
    # Read the raw bytes of the file and base64-encode them
    $NOTEBOOK_BASE64 = [Convert]::ToBase64String([IO.File]::ReadAllBytes($_.FullName))
    $BODY = @{
        "content" = "$NOTEBOOK_BASE64"
        "path" = "$DESTINATION_PATH/$NOTEBOOK_NAME"
        "language" = "SCALA"
        "overwrite" = $true  # serialized by ConvertTo-Json as a JSON boolean
        "format" = "SOURCE"
    }
    $BODY_JSON = $BODY | ConvertTo-Json
    Invoke-RestMethod -Method POST -Uri "https://$DATABRICKS_REGION.azuredatabricks.net/api/2.0/workspace/import" -Headers $HEADERS -Body $BODY_JSON | Out-Null
}
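The question asks about both Python and Scala notebooks, while the script above only picks up *.scala files. Below is a minimal sketch of the same two-step flow (mkdirs, then import) in Python rather than PowerShell. It is not the answerer's script: the names plan_imports, post and deploy, the LANGUAGES mapping, and the use of a bearer token are assumptions made for illustration. Like the script above, it keeps the file extension in the workspace path.

```python
import base64
import json
import os
import urllib.request

# Assumed mapping from file extension to the API's "language" value.
LANGUAGES = {".py": "PYTHON", ".scala": "SCALA"}


def plan_imports(source_root, destination):
    """Return (workspace_dirs, notebooks) for everything under source_root.

    notebooks is a list of (local_path, workspace_path, language) tuples.
    """
    workspace_dirs, notebooks = [], []
    for current, dirs, files in os.walk(source_root):
        dirs.sort()  # deterministic traversal order
        rel = os.path.relpath(current, source_root)
        if rel == ".":
            prefix = destination
        else:
            prefix = destination + "/" + rel.replace(os.sep, "/")
            workspace_dirs.append(prefix)
        for name in sorted(files):
            language = LANGUAGES.get(os.path.splitext(name)[1].lower())
            if language is not None:  # skip files that are not notebooks
                local_path = os.path.join(current, name)
                notebooks.append((local_path, prefix + "/" + name, language))
    return workspace_dirs, notebooks


def post(host, token, endpoint, body):
    """POST a JSON body to https://<host>/api/2.0/<endpoint>."""
    request = urllib.request.Request(
        "https://" + host + "/api/2.0/" + endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def deploy(host, token, source_root, destination):
    """Create the directory tree, then import each notebook."""
    workspace_dirs, notebooks = plan_imports(source_root, destination)
    for path in workspace_dirs:
        post(host, token, "workspace/mkdirs", {"path": path})
    for local_path, workspace_path, language in notebooks:
        with open(local_path, "rb") as handle:
            content = base64.b64encode(handle.read()).decode("ascii")
        post(host, token, "workspace/import", {
            "content": content,
            "path": workspace_path,
            "language": language,
            "format": "SOURCE",
            "overwrite": True,
        })
```

Calling plan_imports("./db_code", "/Users/dmitriy@kagarlickij.com") first and inspecting the result is a cheap way to check the computed workspace paths before any request is sent.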
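Answer 1 (score: 0)
Yes, the notebooks must be base64-encoded. You can do this with PowerShell. Try the following, where $SOURCES_PATH is the path to a single notebook file.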
$BinaryContents = [System.IO.File]::ReadAllBytes("$SOURCES_PATH")
$EncodedContents = [System.Convert]::ToBase64String($BinaryContents)
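For comparison, the same encoding step can be sketched in Python. The helper name encode_notebook is hypothetical. The point is that the file's bytes, not its path, go into content: a path such as ./db_code contains characters like "." and "_" that are outside the standard base64 alphabet, which is the likely cause of the "Illegal character" error in the question.

```python
import base64


def encode_notebook(path):
    """Read a file as raw bytes and return its base64 text (ASCII only)."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("ascii")
```

Decoding the result with base64.b64decode gives back the original file bytes, which is a quick sanity check before sending the request.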
The body of the REST call then looks like this.
{
"format": "SOURCE",
"content": "$EncodedContents",
"path": "$DESTINATION_PATH",
"overwrite": true,
"language": "SCALA"
}
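Building this body by string interpolation makes it easy to get quoting or value types wrong, so here is a small sketch that lets a JSON serializer do the work. The function name import_body is hypothetical; the set of language values (SCALA, PYTHON, SQL, R) is what the Workspace API documents, and overwrite is emitted as a JSON boolean.

```python
import json

# Language values documented for the Workspace API import call.
VALID_LANGUAGES = {"SCALA", "PYTHON", "SQL", "R"}


def import_body(encoded_contents, destination_path, language="SCALA"):
    """Serialize the request body for 2.0/workspace/import as a JSON string."""
    language = language.upper()
    if language not in VALID_LANGUAGES:
        raise ValueError("unsupported language: " + language)
    body = {
        "format": "SOURCE",
        "content": encoded_contents,  # must already be base64 text
        "path": destination_path,
        "overwrite": True,  # becomes the JSON literal true
        "language": language,
    }
    return json.dumps(body)
```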
As for importing an entire directory, you can use PowerShell to loop over the files and folders in the local directory and make one REST call per notebook.