How to copy data from a Databricks mount point to ADLS Gen2

Time: 2021-04-14 23:19:01

Tags: azure-databricks azure-data-lake-gen2

I am trying to write the data in the /mnt/Demo folder to ADLS Gen2; could you help with the steps to do this? So far I can run the lines of code below, and with them I can copy data from ADLS into the /mnt/Demo folder and read data back from it. How do I write data to ADLS from Databricks?

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "",  # enter <appId> = application (client) ID registered in AAD
           "fs.azure.account.oauth2.client.secret": "",  # enter <password> = client secret created in AAD
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/cccc/oauth2/token",  # cccc = <tenant> = tenant ID
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

# Mount the Test2 folder of the container onto DBFS
dbutils.fs.mount(
    source = "abfss://yy@xxx.dfs.core.windows.net/Test2",  # abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path>
    mount_point = "/mnt/Demo17",
    extra_configs = configs)

# Read the CSV through the mount point created above
df = spark.read.csv("/mnt/Demo17/Contract.csv", header=True)
df_review = df.select('AccountId', 'Id', 'Contract_End_Date_2__c', 'Contract_Type__c',
                      'StartDate', 'Contract_Term_Type__c', 'Status', 'Description',
                      'CreatedDate', 'LastModifiedDate')
# Write a single CSV part file directly to ADLS Gen2 through an abfss:// URI
df_review.repartition(1).write.mode("append").csv("abfss://salesforcedata@storagedemovs.dfs.core.windows.net/Test2/trial")
display(df_review)
display(df)
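
One way to write to ADLS Gen2 from Databricks is through the mount itself: the mount path is backed by the container, so anything written under /mnt/Demo17 lands in the lake. A minimal sketch, assuming the mount above and a hypothetical output subfolder:

# Writing under the mount path persists to the mounted container;
# /mnt/Demo17 maps to abfss://yy@xxx.dfs.core.windows.net/Test2, so this lands in Test2/output.
df_review.repartition(1).write.mode("append").option("header", "true").csv("/mnt/Demo17/output")

# dbutils.fs.cp can likewise copy individual files between mounted paths
# (the destination file name here is hypothetical):
dbutils.fs.cp("/mnt/Demo17/Contract.csv", "/mnt/Demo17/output/Contract_copy.csv")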
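
The last write line in the snippet targets an abfss:// URI in a different storage account (storagedemovs) than the one mounted, and the OAuth settings passed to dbutils.fs.mount only apply under the mount point. For direct abfss:// access, a common approach is to set the same service-principal options on the Spark session, scoped to the target account. A sketch under that assumption, with <appId>, <password>, and <tenant> as placeholders:

# Session-scoped OAuth settings for the target storage account (placeholder values are assumptions)
spark.conf.set("fs.azure.account.auth.type.storagedemovs.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.storagedemovs.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.storagedemovs.dfs.core.windows.net", "<appId>")
spark.conf.set("fs.azure.account.oauth2.client.secret.storagedemovs.dfs.core.windows.net", "<password>")
spark.conf.set("fs.azure.account.oauth2.client.endpoint.storagedemovs.dfs.core.windows.net",
               "https://login.microsoftonline.com/<tenant>/oauth2/token")

# With the session configured, the direct write from the snippet above should succeed:
df_review.repartition(1).write.mode("append").csv(
    "abfss://salesforcedata@storagedemovs.dfs.core.windows.net/Test2/trial")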

0 Answers:

There are no answers yet.