Question

I am trying to load data from Azure Blob Storage into Delta Lake. I am using the following code snippet:

storage_account_name = "xxxxxxxxdev"
storage_account_access_key = "xxxxxxxxxxxxxxxxxxxxx"

file_location = "wasbs://bicc-hdspk-eus-qc@xxxxxxxxdev.blob.core.windows.net/FSHC/DIM/FSHC_DIM_SBU"

file_type = "csv"

spark.conf.set("fs.azure.account.key." + storage_account_name + ".blob.core.windows.net", storage_account_access_key)

df = spark.read.format(file_type).option("header", "true").option("inferSchema", "true").option("delimiter", '|').load(file_location)

dx = df.write.format("parquet")

Everything works fine up to this step, and I am also able to load the data into a Databricks table.

dx.write.format("delta").save(file_location)

Error: AttributeError: 'DataFrameWriter' object has no attribute 'write'

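P.S. Am I passing the wrong file location to the write statement? If that is the cause, what is the file path for Delta Lake?

Please let me know if you need any additional information.

Thanks, Abhirup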

Answer 1

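dx is a DataFrameWriter (df.write.format("parquet") already returns one), so calling .write on it again does not make sense. You can do this instead: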

df = spark.read.format(file_type).option("header","true").option("inferSchema", "true").option("delimiter", '|').load(file_location)

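# save() needs a target path; parquet_path and delta_path are placeholders for your own output locations
df.write.format("parquet").save(parquet_path)
df.write.format("delta").save(delta_path)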
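
To make the flow above concrete, here is a minimal sketch, assuming a Spark session with Delta Lake available (as on Databricks). The helper names wasbs_url, load_csv and write_delta, and the "delta/FSHC_DIM_SBU" target folder, are hypothetical and only for illustration. The sketch writes the Delta output to a different directory than the CSV source, since pointing the Delta write at the folder that already holds the CSV files is likely not what you want.

```python
def wasbs_url(container, storage_account_name, path):
    """Build a wasbs:// URL for a path inside an Azure Blob Storage container."""
    host = f"{storage_account_name}.blob.core.windows.net"
    return f"wasbs://{container}@{host}/{path.lstrip('/')}"


def load_csv(spark, source_path, delimiter="|"):
    """Read a delimited CSV with the same options used in the question."""
    return (
        spark.read.format("csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .option("delimiter", delimiter)
        .load(source_path)
    )


def write_delta(df, target_path, mode="errorifexists"):
    """Write a DataFrame in Delta format to target_path.

    df.write is already a DataFrameWriter, so format(), mode() and save()
    are chained on it directly; there is no second .write to call.
    "errorifexists" is Spark's default save mode and is kept here so an
    existing location is not silently overwritten.
    """
    df.write.format("delta").mode(mode).save(target_path)


# Usage (requires a live Spark session, so it is left commented out):
# source = wasbs_url("bicc-hdspk-eus-qc", storage_account_name, "FSHC/DIM/FSHC_DIM_SBU")
# target = wasbs_url("bicc-hdspk-eus-qc", storage_account_name, "delta/FSHC_DIM_SBU")  # hypothetical folder
# write_delta(load_csv(spark, source), target)
```

Once written, the data should be readable again with spark.read.format("delta").load(target).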
