Below is my code:
val spark = SparkSession.builder().master("local[*]").appName("demoApp").getOrCreate()
spark.sparkContext.hadoopConfiguration.set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
spark.sparkContext.hadoopConfiguration.set("fs.azure.account.key.<storage-account>.blob.core.windows.net", <account_key>)
val baseDir = "wasb://<container-name>@<storage_account>.blob.core.windows.net/"
val df = spark.read.orc(baseDir+"path")
Error:
org.apache.spark.sql.AnalysisException: Path does not exist wasb://<container-name>@<storage_account>.blob.core.windows.net/path
Answer (score: 1):
I would suggest checking the following documentation.
It provides examples of how to read data from a Blob storage account using both the standard Spark API and the Databricks API; the code shown there is as follows:
val df = spark.read.parquet("wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>")
dbutils.fs.ls("wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>")
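Putting the question's setup together with the documented form, a minimal sketch might look like the following. The placeholders (`<storage-account>`, `<container-name>`, `<account-key>`, and the `"path"` suffix) are hypothetical and must be replaced with your own values, and the `hadoop-azure` and `azure-storage` JARs need to be on the classpath:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical placeholders -- substitute your own account, container, and key.
val storageAccount = "<storage-account>"
val containerName  = "<container-name>"
val accountKey     = "<account-key>"

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("demoApp")
  .getOrCreate()

// Register the account key so the Azure Blob filesystem can authenticate.
spark.sparkContext.hadoopConfiguration.set(
  s"fs.azure.account.key.$storageAccount.blob.core.windows.net",
  accountKey)

// Use one scheme consistently: wasbs:// (TLS) is generally preferred over wasb://.
val baseDir = s"wasbs://$containerName@$storageAccount.blob.core.windows.net/"

// Reading ORC data, as in the question.
val df = spark.read.orc(baseDir + "path")
df.printSchema()
```

Note that the `AnalysisException: Path does not exist` in the question usually means the directory suffix or container name does not match what is actually in the storage account, so it is worth listing the container contents (for example with `dbutils.fs.ls` on Databricks, as shown above) to confirm the path before reading.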