I am trying to connect Databricks to Synapse using a service principal. I have configured the service principal in the cluster configuration:
fs.azure.account.auth.type.<datalake>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id <Service Principal ID/Application ID>
fs.azure.account.oauth2.client.secret <Client secret key/Service Principal Password>
fs.azure.account.oauth2.client.endpoint https://login.microsoftonline.com/<tenant-id>/oauth2/token
fs.azure.createRemoteFileSystemDuringInitialization true
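For reference, the same settings can also be applied per notebook session instead of in the cluster configuration. This is a minimal sketch assuming the placeholders (`<datalake>`, `<application-id>`, `<tenant-id>`) and the secret scope/key names are replaced with your own values; storing the client secret in a Databricks secret scope via `dbutils.secrets.get` is the usual practice rather than hard-coding it:

```python
# Hypothetical placeholders: <datalake>, <application-id>, <tenant-id>,
# and the secret scope/key names must be replaced with your own values.
spark.conf.set(
    "fs.azure.account.auth.type.<datalake>.dfs.core.windows.net", "OAuth")
spark.conf.set(
    "fs.azure.account.oauth.provider.type",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(
    "fs.azure.account.oauth2.client.id", "<application-id>")
spark.conf.set(
    "fs.azure.account.oauth2.client.secret",
    dbutils.secrets.get(scope="my-scope", key="sp-client-secret"))  # assumed scope/key
spark.conf.set(
    "fs.azure.account.oauth2.client.endpoint",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token")
```

Note that these settings authenticate the Databricks cluster to the Data Lake only; they do not affect how Synapse itself reaches the storage account.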
Although I can connect to the Data Lake successfully and work with it, I cannot write to Synapse when I use the following command...
DummyDF.write.format("com.databricks.spark.sqldw") \
    .mode("append") \
    .option("url", jdbcUrl) \
    .option("useAzureMSI", "true") \
    .option("tempDir", tempdir) \
    .option("dbTable", "DummyTable") \
    .save()
I get the following error...
Py4JJavaError: An error occurred while calling o831.save.
: com.databricks.spark.sqldw.SqlDWSideException: SQL DW failed to execute the JDBC query produced by the connector.
Underlying SQLException(s):
com.microsoft.sqlserver.jdbc.SQLServerException: External file access failed due to internal error: 'Error occurred while accessing HDFS: Java exception raised on call to HdfsBridge_IsDirExist. Java exception message:
HdfsBridge::isDirExist - Unexpected error encountered checking whether directory exists or not: AbfsRestOperationException: Operation failed: "This request is not authorized to perform this operation using this permission.", 403, HEAD, https://datalakename.dfs.core.windows.net/temp/2020-06-24/14-21-57-819/88228292-9f00-4da0-b778-d3421ea4d2ec?upn=false&timeout=90' [ErrorCode = 105019] [SQLState = S0001]
However, I can write to Synapse with the following command...
DummyDF.write.mode("append").jdbc(jdbcUrl,"DummyTable")
I am not sure what is missing.
Answer 0 (score: 0)
The second option does not use PolyBase; it goes through plain JDBC only, which is slower.
I think your error is not related to the Databricks and SQL DW libraries, but to the connectivity between Synapse and the storage account.
You can check the following:
Here is an article describing how to resolve the same 105019 error code as yours: https://techcommunity.microsoft.com/t5/azure-synapse-analytics/msg-10519-when-attempting-to-access-external-table-via-polybase/ba-p/690641
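In particular, with `useAzureMSI` the connector makes Synapse access `tempDir` through its own managed identity, so the 403 usually means that identity lacks data-plane rights on the storage account. A common fix is granting it the "Storage Blob Data Contributor" role. This is a sketch using the Azure CLI, assuming a standalone SQL DW hosted on a logical SQL server; `<sql-server-name>`, `<resource-group>`, `<subscription-id>`, and `<storage-account>` are placeholders for your own values:

```shell
# Assumption: the Synapse (SQL DW) server's managed identity is missing the
# "Storage Blob Data Contributor" role on the Data Lake storage account.
# All <...> values below are placeholders, not real names.

# Look up the managed identity of the logical SQL server hosting the pool.
PRINCIPAL_ID=$(az sql server show \
  --name <sql-server-name> \
  --resource-group <resource-group> \
  --query identity.principalId -o tsv)

# Grant it data-plane access on the storage account used for tempDir.
az role assignment create \
  --assignee "$PRINCIPAL_ID" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
```

Role assignments can take a few minutes to propagate before the PolyBase load starts succeeding.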