No FileSystem for scheme: wasbs error with Hadoop 3.2.1

Time: 2019-12-26 22:06:05

Tags: apache-spark hadoop azure-storage-blobs

I have a Spark cluster (v2.4.0) running on k8s (AKS at the moment). I am trying to read a file into a Spark DataFrame using a path with the following scheme:

s"wasbs://${container}@${account}.blob.core.windows.net/${prefix}"
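For context, the full path is built by ordinary string interpolation. A minimal sketch (the container, account, and prefix values here are placeholders I made up; in my code they come from configuration):

```scala
object WasbsPath {
  // Hypothetical placeholder values standing in for the real ones.
  val container = "mycontainer"
  val account   = "myaccount"
  val prefix    = "data/2019/12/26"

  // Same interpolation as the path string above; this is what gets passed
  // to spark.read, which is where the FileSystem lookup for "wasbs" happens.
  val path = s"wasbs://${container}@${account}.blob.core.windows.net/${prefix}"

  def main(args: Array[String]): Unit = println(path)
}
```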

In my Scala code, I set the Hadoop configuration as follows:

    sparkContext.getConf.set(s"fs.azure.account.key.${account}.blob.core.windows.net", accountKey)
    sparkContext.hadoopConfiguration.set(s"fs.azure.account.key.${account}.dfs.core.windows.net", accountKey)
    sparkContext.hadoopConfiguration.set(s"fs.azure.account.key.${account}.blob.core.windows.net", accountKey)

I am able to read the file when using the following dependency in the project's build.sbt:

"org.apache.hadoop" % "hadoop-azure" % "2.7.3"
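In context, the relevant part of build.sbt looks roughly like this (a sketch; the Spark dependency line is my assumption based on the cluster version, only the hadoop-azure line is the one in question):

```scala
// build.sbt (fragment)
libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-sql"    % "2.4.0" % Provided,  // assumed, matches cluster v2.4.0
  "org.apache.hadoop" %  "hadoop-azure" % "2.7.3"              // works; 3.2.1 fails as below
)
```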

But when I try to use the latest version, 3.2.1, I get the error No FileSystem for scheme: wasbs with this stack trace:

org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:547)
org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:545)
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
scala.collection.immutable.List.foreach(List.scala:392)
scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
scala.collection.immutable.List.flatMap(List.scala:355)
org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:545)
org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:359)
org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)

0 Answers:

No answers yet.