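I want to set up a Maven project that helps me unit test Hadoop MapReduce jobs. My biggest problem is defining the Maven dependencies needed to use the test classes MiniDFSCluster and MiniMRCluster.
I am using Hadoop 2.4.1. Any ideas?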
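Answer 0: (score: 6)
In case anyone else is still searching for an answer:
MiniMRCluster is now deprecated.
You can get both MiniDFSCluster and MiniMRCluster from the following dependency (shown in Gradle syntax):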
compile group: 'org.apache.hadoop', name: 'hadoop-minicluster', version: '2.7.2'
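This dependency is essentially just a POM file that lists the dependencies bundled in this package. For anyone who wants to take a look, MiniDFSCluster lives in the artifact hadoop-hdfs:tests.
You do not have to use the dependencies from the Cloudera repository.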
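Since the question asks about Maven, the Gradle line above would translate roughly into the POM fragment below. This is a sketch rather than a verified configuration: the `test` scope is an assumption (it fits the case where the mini clusters are used only from unit tests), and the version is simply the one quoted in this answer, so you would probably want to align it with the Hadoop version you actually run.

```xml
<!-- Goes inside the <dependencies> element of pom.xml -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-minicluster</artifactId>
  <!-- Version taken from the Gradle line above; adjust to your Hadoop version -->
  <version>2.7.2</version>
  <!-- Assumption: only needed on the test classpath -->
  <scope>test</scope>
</dependency>
```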
Answer 1: (score: 2)
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
</repositories>
Then add the following to the project's dependencies:
<dependency>
  <groupId>commons-io</groupId>
  <artifactId>commons-io</artifactId>
  <version>2.1</version>
</dependency>
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.11</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-auth</artifactId>
  <version>2.0.0-cdh4.3.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-test</artifactId>
  <version>2.0.0-mr1-cdh4.3.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <version>2.0.0-cdh4.3.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <version>2.0.0-cdh4.3.0</version>
  <classifier>tests</classifier>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.0.0-cdh4.3.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.0.0-cdh4.3.0</version>
  <classifier>tests</classifier>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>2.0.0-mr1-cdh4.3.0</version>
</dependency>
If anyone is interested in the whole project (a unit test for the famous WordCount MapReduce job), I am happy to share it.