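I am trying to develop a MapReduce job using the Cloudera Hadoop distribution. I am using API version 2 (the `org.apache.hadoop.mapreduce` package). I am having trouble with MRUnit. Please suggest what to do. I used the standard archetype and I am completely lost; I don't know where the root of the problem is. Here are my dependencies: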
<dependency>
    <groupId>com.cloudera.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.2-320</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>com.cloudera.hadoop</groupId>
    <artifactId>hadoop-mrunit</artifactId>
    <version>0.20.2-320</version>
    <scope>test</scope>
</dependency>
Here is my test code:
@Test
public void testEmptyOutput() throws Exception {
    for (String line : linesFromFlatFile) {
        //List<Pair<GetReq, IntWritable>> output =
        driver.withInput(UNUSED_LONG_KEY, new Text(line))
        //      .withOutput(null, null)
                .run();
        //assertTrue("", output.isEmpty());
    }
}
Here is the exception:
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.283 sec <<< FAILURE!
> testEmptyOutput(MapperTest)  Time elapsed: 0.258 sec  <<< ERROR!
> java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.TaskAttemptID.<init>(Ljava/lang/String;IZII)V
>     at org.apache.hadoop.mrunit.mapreduce.mock.MockMapContextWrapper$MockMapContext.<init>(MockMapContextWrapper.java:71)
>     at org.apache.hadoop.mrunit.mapreduce.mock.MockMapContextWrapper.getMockContext(MockMapContextWrapper.java:144)
>     at org.apache.hadoop.mrunit.mapreduce.MapDriver.run(MapDriver.java:197)
>     at MapperTest.testEmptyOutput(ScoringCounterMapperTest.java:42)
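A `NoSuchMethodError` like the one above usually means that the class found at runtime is not the version the caller was compiled against. The descriptor `(Ljava/lang/String;IZII)V` corresponds to a constructor taking `(String, int, boolean, int, int)`. One way to check what is actually on the test classpath is a small reflection probe. This is only a sketch: `ClasspathProbe` and its method names are made up for illustration, and it has to be run with the same classpath as the failing test to tell you anything useful.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Hadoop or MRUnit.
final class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typical for classes loaded by the bootstrap class loader.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the class declares a constructor with exactly these parameter types. */
    static boolean hasConstructor(Class<?> type, Class<?>... parameterTypes) {
        try {
            type.getDeclaredConstructor(parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class name taken from the stack trace; pass another name as the first argument if needed.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.TaskAttemptID";
        Class<?> type = Class.forName(name);
        System.out.println(name + " loaded from " + originOf(type));
        // Matches the descriptor (Ljava/lang/String;IZII)V from the exception.
        System.out.println("has (String, int, boolean, int, int) constructor: "
                + hasConstructor(type, String.class, int.class, boolean.class, int.class, int.class));
    }
}
```

If the probe reports that the constructor is missing, the jar it names is a likely candidate for the mismatched Hadoop version.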
package mypackage;
import java.util.Date;
import java.util.List;
import junit.framework.TestCase;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.types.Pair;
import org.junit.Before;
import org.junit.Test;
import Sample;
import GetReq;
public class MapperTest extends TestCase {

    private static final IntWritable ONE_OCCURANCE = new IntWritable(1);
    private static final LongWritable UNUSED_LONG_KEY = new LongWritable(new Date().getTime());

    private Mapper<LongWritable, Text, GetReq, IntWritable> mapper;
    private MapDriver<LongWritable, Text, GetReq, IntWritable> driver;

    List<String> linesFromFlatFileNoOutput = null;
    List<String> linesFromFlatFileWithOutput = null;

    @Before
    public void setUp() {
        mapper = newMapper();
        driver = new MapDriver<LongWritable, Text, GetReq, IntWritable>(mapper);
        Mapper.METADATA_CSV = "../../data/metadata.csv"; //ugly hook
        linesFromFlatFileNoOutput = Sample.instance.getLinesFromFlatFileNoOutput();
        linesFromFlatFileWithOutput = Sample.instance.getLinesFromFlatFileWithOutput();
    }

    @Test
    public void testEmptyOutput() throws Exception {
        for (String line : linesFromFlatFileNoOutput) {
            //List<Pair<GetReq, IntWritable>> output =
            driver.withInput(UNUSED_LONG_KEY, new Text(line))
                    .withOutput(null, null)
                    .runTest();
            //assertTrue("", output.isEmpty());
        }
    }

    @Test
    public void testResultOutput() throws Exception {
        for (String line : linesFromFlatFileWithOutput) {
            driver.withInput(UNUSED_LONG_KEY, new Text(line))
                    //.withOutput(null, null)
                    .runTest();
        }
    }
}
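Hmm... I did not change anything in pom.xml. Now I get output and the same exception. It looks like the mapper runs once, or at least tries to run: I get debug output from the mapper body.
UPD: I added the classifier and changed the dependency: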
<dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>0.9.0-incubating</version>
    <classifier>hadoop2</classifier>
    <scope>test</scope>
</dependency>
Now I have run into another problem:
Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at this line:
context.getCounter(EnumCounter.MATCHED_RECORDS).increment(1);
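What am I doing wrong this time?
Answer 0 (score: 1)
I found the solution: you need to add a classifier tag to the MRUnit dependency. It should look like this: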
<dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>0.9.0-incubating</version>
    <classifier>hadoop2</classifier>
    <scope>test</scope>
</dependency>
Now I have another problem: "Found interface org.apache.hadoop.mapreduce.Counter, but class was expected" on the counter increment. This problem seems to be related to some bug.
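For what it's worth, "Found interface ..., but class was expected" is the message of a `java.lang.IncompatibleClassChangeError`: the calling code was compiled against a version where the type was a class, but the version loaded at runtime declares it as an interface. As far as I know, `org.apache.hadoop.mapreduce.Counter` is a class in the 0.20.x line and an interface in Hadoop 2.x, so mixing a `hadoop-core` 0.20.2 build with the `hadoop2` classifier could plausibly produce exactly this. A minimal sketch for checking which shape is on the test classpath follows; `CounterShapeCheck` is a made-up name, not part of any library.

```java
// Hypothetical helper, not part of Hadoop or MRUnit.
final class CounterShapeCheck {

    /** Returns "interface", "class", or "missing" for the named type on the current classpath. */
    static String kindOf(String className) {
        try {
            // initialize=false: only inspect the type, do not run its static initializers.
            Class<?> type = Class.forName(className, false, CounterShapeCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Type name taken from the error message; pass another name as the first argument if needed.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.Counter";
        System.out.println(name + " is: " + kindOf(name));
    }
}
```

If this reports "interface" when run with the test classpath while the mapper was compiled against `hadoop-core` 0.20.2, the compile-time and test-time Hadoop versions disagree, and the dependencies need to be aligned to one Hadoop line.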
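Answer 1 (score: 0)
I have seen this kind of problem before when the wrong version of a dependency gets pulled in at runtime. Two things have fixed this for me in the past: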