I tried to find a solution but came up with nothing. I'm new to this, so if you know a solution, please help me. Thanks!
Answer 0: (score: 1)
Ok, I found a solution.
# Copy a file from ADLS to an FTP server over TLS (ftplib's FTP_TLS, i.e. FTPS)
from ftplib import FTP_TLS
from azure.datalake.store import core, lib

# Key Vault-backed secret scope holding the service principal credentials for ADLS
keyVaultName = "yourkeyvault"

# Set up authentication for ADLS
tenant_id = dbutils.secrets.get(scope = keyVaultName, key = "tenantId")
username = dbutils.secrets.get(scope = keyVaultName, key = "appRegID")
password = dbutils.secrets.get(scope = keyVaultName, key = "appRegSecret")
store_name = 'ADLSStoridge'
token = lib.auth(tenant_id = tenant_id, client_id = username, client_secret = password)
adl = core.AzureDLFileSystem(token, store_name=store_name)

# Create a secure connection to the FTP server
ftp = FTP_TLS('ftp.xyz.com')
# Add credentials
ftp.login(user='', passwd='')
# Switch the data connection to TLS
ftp.prot_p()
# Set the target directory on the FTP server
ftp.cwd('folder path on FTP')

# Open the file in ADLS and stream it to the FTP server
f = adl.open('adls path of your file', 'rb')
ftp.storbinary('STOR myfile.csv', f)

# Clean up
f.close()
ftp.quit()
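For the reverse direction, a minimal sketch (assuming the same adl filesystem object and an open FTP_TLS connection as above; the ADLS target path and file name are hypothetical) streams a file from the FTP server back into ADLS:

# Download a file from the FTP server and write it into ADLS (hypothetical paths)
out = adl.open('adls path for the downloaded file', 'wb')
ftp.retrbinary('RETR myfile.csv', out.write)
out.close()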
Answer 1: (score: 0)
In Databricks, you can access files stored in ADLS in any of the following ways. There are three ways to access Azure Data Lake Storage Gen2: mount it into DBFS with a service principal and OAuth 2.0, access it directly with a service principal and OAuth 2.0, or access it directly using the storage account access key. The steps below cover mounting; a sketch of direct access appears after the mount example.
Steps to mount the storage and access its files as if they were local files:
To mount an Azure Data Lake Storage Gen2 filesystem, or a folder inside a container, use the following command:
Syntax:
configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "<appId>",
           "fs.azure.account.oauth2.client.secret": "<password>",
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

dbutils.fs.mount(
    source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/folder1",
    mount_point = "/mnt/flightdata",
    extra_configs = configs)
Example:
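As an illustration, a minimal sketch of the mount call follows; the storage account name "contosoadlsgen2" and container "flights" are hypothetical placeholders, and the credentials are read from the same Key Vault-backed secret scope used in the answer above:

# Mount an ADLS Gen2 folder (hypothetical storage account "contosoadlsgen2", container "flights")
configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": dbutils.secrets.get(scope = "yourkeyvault", key = "appRegID"),
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope = "yourkeyvault", key = "appRegSecret"),
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/" + dbutils.secrets.get(scope = "yourkeyvault", key = "tenantId") + "/oauth2/token",
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

dbutils.fs.mount(
    source = "abfss://flights@contosoadlsgen2.dfs.core.windows.net/folder1",
    mount_point = "/mnt/flightdata",
    extra_configs = configs)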
After mounting ADLS, you can access the file system as if the files were local, for example:
df = spark.read.csv("/mnt/flightdata/flightdata.csv", header="true")
display(df)
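If you prefer not to mount the storage, a minimal sketch of the direct-access approach (assuming the same service principal credentials; the storage account and container names are placeholders) sets the OAuth configuration on the Spark session and reads via the abfss:// path:

# Direct access to ADLS Gen2 without mounting (placeholder account, container and credentials)
spark.conf.set("fs.azure.account.auth.type.<storage-account-name>.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account-name>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account-name>.dfs.core.windows.net", "<appId>")
spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account-name>.dfs.core.windows.net", "<password>")
spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account-name>.dfs.core.windows.net", "https://login.microsoftonline.com/<tenant>/oauth2/token")

df = spark.read.csv("abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/flightdata.csv", header="true")
display(df)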
Reference: Databricks - Azure Data Lake Storage Gen2.
Hope this helps.