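I am trying to read a series of JSON files stored in Azure Blob Storage into Spark from a Databricks notebook. I set my account name and key in the Spark conf, but it always returns this error: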
shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.lang.IllegalArgumentException: The String is not a valid Base64-encoded string.
I have been following the information provided here:
https://docs.databricks.com/_static/notebooks/data-import/azure-blob-store.html
and here:
https://luminousmen.com/post/azure-blob-storage-with-pyspark
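I can pull the data just fine using the Azure SDK for Python, so the account and key themselves appear to work. This is the notebook code: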
storage_account_name = "name"
storage_account_access_key = "key"
spark.conf.set(
"fs.azure.account.key."+storage_account_name+".blob.core.windows.net",
storage_account_access_key)
file_location = "wasbs://loc/locationpath"
file_type = "json"
df = spark.read.format(file_type).option("inferSchema", "true").load(file_location)
Expected result: this should return a DataFrame containing the data from the JSON files.
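
Since the exception complains about Base64, one thing worth checking before calling spark.conf.set is whether the key string actually decodes as Base64 (for example, that it was not truncated or copied with stray whitespace). The sketch below is only a diagnostic aid, not a confirmed fix: the helper names is_valid_base64 and wasbs_url are hypothetical and not part of Spark or Databricks. The second helper assumes the wasbs://<container>@<account>.blob.core.windows.net/<path> URL form described in the Databricks documentation.

```python
import base64
import binascii


def is_valid_base64(key: str) -> bool:
    """Return True if `key` decodes as strict standard Base64.

    Hypothetical helper: with validate=True, any character outside the
    Base64 alphabet (including whitespace or a newline) is rejected.
    """
    try:
        base64.b64decode(key, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


def wasbs_url(container: str, account: str, path: str) -> str:
    """Build a wasbs:// URL in the container@account form (hypothetical helper)."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# Placeholder values only; substitute the real account, container and key.
storage_account_access_key = "key"
if not is_valid_base64(storage_account_access_key):
    print("Access key is not valid Base64; re-copy it from the Azure portal.")
print(wasbs_url("container", "name", "locationpath"))
```

If the key fails this check in the notebook but works with the Azure SDK, comparing the two strings character by character would show whether they really are the same value.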