Specifying multiple JARs for Azure with Spark

Date: 2017-09-22 14:54:23

Tags: azure apache-spark azure-storage

I am trying to access an Azure blob from my spark-shell, but I get the following error:

scala> sc.textFile("wasb://mycontainer@test.blob.core.windows.net/testfolder/txtfile").count()
java.lang.NoClassDefFoundError: com/microsoft/azure/storage/StorageException
  at org.apache.hadoop.fs.azure.NativeAzureFileSystem.createDefaultStore(NativeAzureFileSystem.java:1064)
  at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1035)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)

Multiple JARs in the --jars option

Ps-M:spark-2.1.1-bin-hadoop2.7 p$ bin/spark-shell --jars "/Users/p/Documents/ba/spark-tutorial/spark-2.1.1-bin-hadoop2.7/jars/hadoop-azure-2.7.0.jar" "/Users/p/Documents/ba/spark-tutorial/spark-2.1.1-bin-hadoop2.7/jars/azure-storage-2.0.0.jar"

I would like to know how to specify multiple JARs in the --jars option; right now I am passing them as "jar1" "jar2".

1 Answer:

Answer 0 (score: 0)

Multiple JARs must be separated by commas, not spaces, in a single --jars argument.

Try running:

Ps-M:spark-2.1.1-bin-hadoop2.7 p$ bin/spark-shell --jars "/Users/p/Documents/ba/spark-tutorial/spark-2.1.1-bin-hadoop2.7/jars/hadoop-azure-2.7.0.jar,/Users/p/Documents/ba/spark-tutorial/spark-2.1.1-bin-hadoop2.7/jars/azure-storage-2.0.0.jar"

Also, make sure the com/microsoft/azure/storage/StorageException class is actually present in the JARs above. You can check by running:

jar -tvf /Users/p/Documents/ba/spark-tutorial/spark-2.1.1-bin-hadoop2.7/jars/hadoop-azure-2.7.0.jar | grep -i com.microsoft.azure.storage.StorageException

jar -tvf /Users/p/Documents/ba/spark-tutorial/spark-2.1.1-bin-hadoop2.7/jars/azure-storage-2.0.0.jar | grep -i com.microsoft.azure.storage.StorageException

If the StorageException class is not found in either of the two JARs above, locate the JAR that does contain that class and add it to the --jars option.

Hope this helps.