org.apache.spark.sql.AnalysisException: Path does not exist when trying to access Azure from Spark

Date: 2018-07-20 08:35:24

Tags: azure apache-spark azure-storage-blobs

Below is my code:

val spark = SparkSession.builder().master("local[*]").appName("demoApp").getOrCreate()
spark.sparkContext.hadoopConfiguration.set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
spark.sparkContext.hadoopConfiguration.set("fs.azure.account.key.<storage-account>.blob.core.windows.net", "<account_key>")

val baseDir = "wasb://<container-name>@<storage_account>.blob.core.windows.net/"

val df = spark.read.orc(baseDir+"path")

Error:

org.apache.spark.sql.AnalysisException: Path does not exist wasb://<container-name>@<storage_account>.blob.core.windows.net/path
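
As a debugging aid, here is a minimal sketch (assuming the spark session and baseDir value from the snippet above) that lists what the Hadoop FileSystem sees at the container root, which helps confirm whether the directory actually exists under that account:

import org.apache.hadoop.fs.{FileSystem, Path}
import java.net.URI

// Resolve the filesystem for the container and print every entry at its root.
val fs = FileSystem.get(new URI(baseDir), spark.sparkContext.hadoopConfiguration)
fs.listStatus(new Path(baseDir)).foreach(status => println(status.getPath))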

1 Answer:

Answer 0 (score: 1):

I would suggest checking the following documentation.

It provides examples of how to read data from a Blob storage account using both the standard Spark API and the Databricks utilities; the code shown is as follows:

val df = spark.read.parquet("wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>")

dbutils.fs.ls("wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>")