Azure Databricks API: import entire directory with notebooks

Time: 2020-06-29 11:02:16

Tags: azure-databricks

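I need to import many notebooks (both Python and Scala) into Databricks using the Databricks REST API 2.0.

My source path (local machine) is ./db_code and the destination (Databricks workspace) is /Users/dmitriy@kagarlickij.com

I'm trying to build a 2.0/workspace/import call, so my request body is: { "content": "$SOURCES_PATH", "path": "$DESTINATION_PATH", "format": "SOURCE", "language": "SCALA", "overwrite": true }

However, I get the error: Could not parse request object: Illegal character. According to the docs, content must be "The base64-encoded content".

Should I encode all notebooks to base64?

Maybe there are some examples of importing a directory into Databricks using the API?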

2 answers:

Answer 0 (score: 1)

After some research, I managed to get it working:

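Write-Output "Task: Create Databricks Directory Structure"
# Recreate every local subdirectory in the workspace before importing notebooks.
# $HEADERS is assumed to be defined earlier and to carry the Authorization header.
Get-ChildItem "$SOURCES_PATH" -Recurse -Directory |
Foreach-Object {
    # Path relative to $WORKDIR_PATH. This relies on PowerShell 7+, where
    # Split() treats its argument as a single separator string.
    $DIR = $_.FullName.split("$WORKDIR_PATH/")[1]
    $BODY = @{
        "path" = "$DESTINATION_PATH/$DIR"
    }
    $BODY_JSON = $BODY | ConvertTo-Json
    Invoke-RestMethod -Method POST -Uri "https://$DATABRICKS_REGION.azuredatabricks.net/api/2.0/workspace/mkdirs" -Headers $HEADERS -Body $BODY_JSON | Out-Null
}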

Write-Output "Task: Deploy Scala notebooks to Databricks"
Get-ChildItem "$SOURCES_PATH" -Recurse -Include *.scala |
Foreach-Object {
    # Notebook path relative to $WORKDIR_PATH (same PowerShell 7+ caveat as above)
    $NOTEBOOK_NAME = $_.FullName.split("$WORKDIR_PATH/")[1]
    # The API expects the notebook source as a base64 string, not a file path
    $NOTEBOOK_BASE64 = [Convert]::ToBase64String([IO.File]::ReadAllBytes($_.FullName))
    $BODY = @{
        "content" = "$NOTEBOOK_BASE64"
        "path" = "$DESTINATION_PATH/$NOTEBOOK_NAME"
        "language" = "SCALA"
        "overwrite" = $true   # serialized as a JSON boolean
        "format" = "SOURCE"
    }
    $BODY_JSON = $BODY | ConvertTo-Json
    Invoke-RestMethod -Method POST -Uri "https://$DATABRICKS_REGION.azuredatabricks.net/api/2.0/workspace/import" -Headers $HEADERS -Body $BODY_JSON | Out-Null
}
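For readers not working in PowerShell, the directory-creation step can be sketched in Python. This is only an illustration, not the answerer's code: the function names `workspace_dirs` and `mkdirs_bodies` are hypothetical, and the sketch assumes paths are taken relative to the source directory itself (the script above uses `$WORKDIR_PATH` instead). It only builds the request bodies for `2.0/workspace/mkdirs`; sending them is left out.

```python
import posixpath
from pathlib import Path
from typing import Dict, List


def workspace_dirs(src_root: str, dest_root: str) -> List[str]:
    """Return the workspace directory matching every subdirectory of src_root."""
    root = Path(src_root)
    result = []
    # Sorting lists parent directories before their children.
    for entry in sorted(root.rglob("*")):
        if entry.is_dir():
            # as_posix() yields forward slashes on every platform,
            # which is what workspace paths use.
            relative = entry.relative_to(root).as_posix()
            result.append(posixpath.join(dest_root, relative))
    return result


def mkdirs_bodies(src_root: str, dest_root: str) -> List[Dict[str, str]]:
    """Build one request body per directory for 2.0/workspace/mkdirs."""
    return [{"path": d} for d in workspace_dirs(src_root, dest_root)]


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the call.
    for body in mkdirs_bodies("./db_code", "/Users/someone@example.com"):
        print(body)
```

Computing the relative path with `Path.relative_to` avoids string splitting on the working directory, so the result does not depend on how the shell handles separators.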

Answer 1 (score: 0)

Yes, notebooks must be base64-encoded. You can do that with PowerShell. Try the following.

    # Here $SOURCES_PATH must point to a single notebook file, not a directory
    $BinaryContents = [System.IO.File]::ReadAllBytes("$SOURCES_PATH")
    $EncodedContents = [System.Convert]::ToBase64String($BinaryContents)

The body of the REST call looks like this.

{
    "format": "SOURCE",
    "content": "$EncodedContents",
    "path": "$DESTINATION_PATH",
    "overwrite": true,
    "language": "SCALA"
}
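The same body can be assembled programmatically, which avoids quoting mistakes. Below is a minimal Python sketch, given as an illustration rather than as part of the original answer; `build_import_body` is a hypothetical helper name, and the field names are the ones shown in the body above.

```python
import base64
import json


def build_import_body(source: bytes, dest_path: str, language: str = "SCALA") -> str:
    """Return the JSON body for 2.0/workspace/import for one notebook."""
    body = {
        # The API wants the notebook source base64-encoded, as text.
        "content": base64.b64encode(source).decode("ascii"),
        "path": dest_path,
        "format": "SOURCE",
        "language": language,
        "overwrite": True,  # json.dumps writes this as the JSON boolean true
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Hypothetical notebook source and destination path.
    print(build_import_body(b'println("hello")', "/Users/someone@example.com/hello"))
```

Letting a JSON library serialize the dictionary keeps `overwrite` a real boolean and escapes any special characters in the path.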

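As for importing an entire directory, you can use PowerShell to loop through the files and folders in the local directory and make a REST call for each notebook.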
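The loop described above might look roughly like the following, sketched here in Python rather than PowerShell. Everything in it is an assumption made for illustration: the helper names, the extension-to-language mapping, the choice to drop the file extension from the notebook name, and bearer-token authentication with a personal access token. It also assumes the target directories already exist in the workspace.

```python
import base64
import json
import posixpath
import urllib.request
from pathlib import Path

# Assumed mapping from file extension to the API's language value.
LANGUAGES = {".py": "PYTHON", ".scala": "SCALA"}


def import_requests(src_root, dest_root):
    """Yield one workspace/import body per notebook found under src_root."""
    root = Path(src_root)
    for entry in sorted(root.rglob("*")):
        language = LANGUAGES.get(entry.suffix.lower())
        if not entry.is_file() or language is None:
            continue  # skip directories and files that are not notebooks
        # Drop the extension so "etl/load.scala" becomes notebook "etl/load".
        relative = entry.relative_to(root).with_suffix("").as_posix()
        yield {
            "content": base64.b64encode(entry.read_bytes()).decode("ascii"),
            "path": posixpath.join(dest_root, relative),
            "format": "SOURCE",
            "language": language,
            "overwrite": True,
        }


def post_json(url, token, body):
    """POST one JSON body with a bearer token and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def import_directory(src_root, dest_root, host, token, send=post_json):
    """Import every notebook under src_root; `send` can be swapped for a dry run."""
    url = f"https://{host}/api/2.0/workspace/import"
    return [send(url, token, body) for body in import_requests(src_root, dest_root)]
```

Passing a different `send` function makes it possible to inspect the requests without touching a real workspace, for example `import_directory("./db_code", dest, host, token, send=lambda url, tok, body: print(body["path"]))`.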