Connecting Blob storage to Spark fails

Time: 2016-08-29 14:57:52

Tags: apache-spark blob azure-storage hortonworks-data-platform

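I am trying to connect Spark in the Hortonworks 2.4 distribution to Azure Blob storage, and I get the error "wasb file system not recognized". I searched around, and many people suggest downloading azure-sdk-for-java and building the package.

I am trying to build the package with mvn, but whether I run mvn test or mvn package, the process hangs during the tests. No error is reported; it simply never returns. I changed testconfiguration.xml to reflect my blob account name. Below is the log output I get.

Do I need to do anything else to get the jar?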

[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Microsoft Azure Storage Client SDK 4.3.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ azure-storage ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/adminhorton/kspark/azure-storage-java-master/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ azure-storage ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ azure-storage ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ azure-storage ---
[INFO] Compiling 52 source files to /home/adminhorton/kspark/azure-storage-java-master/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:2.13:test (default-test) @ azure-storage ---
[INFO] Surefire report directory: /home/adminhorton/kspark/azure-storage-java-master/target/surefire-reports
T E S T S

parallel='classes', perCoreThreadCount=true, threadCount=2, useUnlimitedThreads=false
Running com.microsoft.azure.storage.StorageAccountTests
Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.057 sec
Running com.microsoft.azure.storage.StorageUriTests
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.071 sec

com.microsoft.azure.storage.StorageAccountTests.txt
com.microsoft.azure.storage.StorageUriTests.txt


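1 answer:

Answer 0 (score: 0)

From your description, I am not sure whether you have configured the storage settings correctly in your project. I therefore suggest checking the configuration with the following steps:

1. Create an Azure storage account and add its access key to core-site.xml in the following format: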

<property>
  <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
  <value>YOUR ACCESS KEY</value>
</property>
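
The property name is long and easy to mistype, so as a minimal sketch the snippet below generates the same `<property>` element from an account name and key. The helper name `azure_key_property` is hypothetical and the values are placeholders; only the property-name pattern is taken from the fragment above.

```python
import xml.etree.ElementTree as ET


def azure_key_property(account: str, access_key: str) -> str:
    """Return a core-site.xml <property> element for a storage account key.

    Hypothetical helper: only the property-name pattern comes from the
    answer; the function itself is not part of Hadoop or HDP.
    """
    prop = ET.Element("property")
    name = ET.SubElement(prop, "name")
    name.text = f"fs.azure.account.key.{account}.blob.core.windows.net"
    value = ET.SubElement(prop, "value")
    value.text = access_key
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; substitute your own account name and access key.
    print(azure_key_property("youraccount", "YOUR ACCESS KEY"))
```

The printed element can be pasted inside the `<configuration>` element of core-site.xml; the script does not edit the file itself.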

2. Restart the HDP services and run hadoop fs -ls wasb://**.blob.core.windows.net/ to list the files in the container.
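
For the listing step, WASB URIs generally take the form `wasb://<container>@<account>.blob.core.windows.net/<path>`. The sketch below assembles such a URI and the matching `hadoop fs -ls` command. The function names and the `yourcontainer` and `youraccount` values are illustrative assumptions, and the command is only executed if a `hadoop` binary is found on the PATH.

```python
import shutil
import subprocess


def wasb_uri(container: str, account: str, path: str = "/") -> str:
    """Build wasb://<container>@<account>.blob.core.windows.net/<path>."""
    return f"wasb://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def list_command(container: str, account: str, path: str = "/") -> list:
    """Return the argv for `hadoop fs -ls` on the given container path."""
    return ["hadoop", "fs", "-ls", wasb_uri(container, account, path)]


if __name__ == "__main__":
    # Placeholder names; replace with your own container and account.
    cmd = list_command("yourcontainer", "youraccount")
    print(" ".join(cmd))
    # Only run the listing if the hadoop CLI is actually installed.
    if shutil.which("hadoop"):
        subprocess.run(cmd, check=False)
```

If the listing succeeds from the command line, the access key in core-site.xml is being picked up, which narrows any remaining problem down to the Spark side.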

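I strongly recommend referring to the blog post how-to-configure-hortonworks-hdp-to-access-azure-windows-storage and the official documentation.

In addition, @Yuval has provided an excellent example of how to connect to Azure blob storage using the Java SDK.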