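I am trying to run a Mahout project that I wrote in Eclipse using the Mahout and Hadoop libraries. It loads a dataset and runs the FPGrowth algorithm on it. I set up the following run configuration to run the project: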
mvn exec:java -Dexec.mainClass=com.patternmatching.RecommendApp.TopPatternMatches
When I run the program, I get the following warning:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
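I researched the problem and learned that the native Hadoop libraries have to be either compiled or downloaded from Apache (see: Hadoop "Unable to load native-hadoop library for your platform" warning). I downloaded the libraries on the Cloudera Quickstart VM, where I had set up Mahout, Maven, and my project package. After running the program in the Cloudera VM, I got the same message. I also ran the hadoop checknative -a command, which checks whether the native libraries are available: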
[root@quickstart /]# hadoop checknative -a
16/10/22 19:32:16 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
16/10/22 19:32:16 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /usr/lib/hadoop/lib/native/libsnappy.so.1
lz4: true revision:99
bzip2: true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so
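The output confirms that the native libraries are available on the machine, but they are apparently not being loaded by my program, or are not on its library path. I am not sure how to configure Maven so that the Hadoop native libraries are loaded when the program runs. Here is the dependencies section of my Maven pom.xml file: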
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.0.0-alpha1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.mahout</groupId>
        <artifactId>mahout-core</artifactId>
        <version>0.9</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
</dependencies>
The command I use to run my Mahout Java program is:
mvn exec:java -Dexec.mainClass=com.patternmatching.RecommendApp.TopPatternMatches
How can I configure Maven so that it can see these native libraries and my program can use them?
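To narrow down whether the JVM that Maven starts can see the directory reported by hadoop checknative, a small diagnostic along the following lines might help. This is only a sketch using the plain JDK: the class name NativePathCheck and the helper dirsContaining are hypothetical names made up for illustration, and it assumes the native library is looked up by the file name that System.mapLibraryName("hadoop") returns (libhadoop.so on Linux) in the directories listed in java.library.path.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic class; not part of Hadoop or Mahout.
class NativePathCheck {

    // Returns the entries of a java.library.path-style string
    // that contain a regular file with the given name.
    static List<String> dirsContaining(String libraryPath, String fileName) {
        List<String> hits = new ArrayList<>();
        if (libraryPath == null || libraryPath.isEmpty()) {
            return hits;
        }
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as "a::b"
            }
            if (new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The search path this JVM uses for System.loadLibrary(...)
        String libraryPath = System.getProperty("java.library.path");
        // Platform-specific file name for a library called "hadoop"
        String fileName = System.mapLibraryName("hadoop");

        System.out.println("java.library.path = " + libraryPath);
        System.out.println("directories containing " + fileName + ": "
                + dirsContaining(libraryPath, fileName));
    }
}
```

If this is run the same way as the application (through mvn exec:java with exec.mainClass pointing at this class), the printed path is the one the Maven JVM was started with, since exec:java runs the main class inside the Maven process. An empty result would suggest that /usr/lib/hadoop/lib/native is not on java.library.path for that JVM.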