I've been playing around with MRUnit and tried running it on a Hadoop WordCount example, following the wordcount and unit testing tutorial. Though not a fan, I've been using Eclipse to run the code, and I keep getting an error on the setMapper function.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.junit.Before;
import org.junit.Test;
public class TestWordCount {
    MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;
    MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
    ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

    @Before
    public void setUp() throws IOException {
        WordCountMapper mapper = new WordCountMapper();
        mapDriver = new MapDriver<LongWritable, Text, Text, IntWritable>();
        mapDriver.setMapper(mapper); // <-- Issue here
        WordCountReducer reducer = new WordCountReducer();
        reduceDriver = new ReduceDriver<Text, IntWritable, Text, IntWritable>();
        reduceDriver.setReducer(reducer);
        mapReduceDriver = new MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable>();
        mapReduceDriver.setMapper(mapper); // <-- Issue here
        mapReduceDriver.setReducer(reducer);
    }
}
Error message:
java.lang.Error: Unresolved compilation problems:
The method setMapper(Mapper<LongWritable,Text,Text,IntWritable>) in the type MapDriver<LongWritable,Text,Text,IntWritable> is not applicable for the arguments (WordCountMapper)
The method setMapper(Mapper<LongWritable,Text,Text,IntWritable>) in the type MapReduceDriver<LongWritable,Text,Text,IntWritable,Text,IntWritable> is not applicable for the arguments (WordCountMapper)
Looking at this issue, I think it might be an API conflict, but I'm not sure where to look for it. Has anyone else run into this problem before?
EDIT: I'm using a user-defined library containing the hadoop2 jars and the latest JUnit (4.10) jar.
EDIT 2: Here is the code for WordCountMapper:
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
FINAL EDIT / WORKING

Turns out I needed to change

WordCountMapper mapper = new WordCountMapper();

to

Mapper mapper = new WordCountMapper();

because of problems with the generics. I also needed to import the mockito library into my user-defined library.
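The raw-typed assignment compiles because raw types opt out of generic type checking entirely. The sketch below uses hypothetical stand-in classes (not the real MRUnit API) to show the effect, assuming only that setMapper takes a parameterized Mapper argument:

```java
// Hypothetical stand-ins (not the real MRUnit API) showing why assigning
// to the raw Mapper type makes the call compile: raw types disable generic
// type checking, so the mismatched key type is no longer rejected.
class Mapper<KEYIN, VALUEIN> {}

class MapDriver<KEYIN, VALUEIN> {
    void setMapper(Mapper<KEYIN, VALUEIN> m) {}
}

public class RawTypeDemo {
    // Key type Object, as in the question's WordCountMapper.
    static class WordCountMapper extends Mapper<Object, String> {}

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        MapDriver<Long, String> driver = new MapDriver<>();
        Mapper mapper = new WordCountMapper(); // raw type: no generic check
        driver.setMapper(mapper);              // compiles, with an unchecked warning
        System.out.println("raw-typed assignment compiles");
    }
}
```

Note that this is a workaround rather than a fix: the unchecked warning is the compiler telling you the key-type mismatch is still there, just no longer enforced.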
Answer 0 (score: 2)
Here's your problem:

public class WordCountMapper extends Mapper<Object, Text, Text, IntWritable>
....
MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

Your WordCountMapper input type (Object) is incompatible with the MapDriver input type (LongWritable). Change your Mapper definition to

class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable>

You will probably also want to change your map method argument from Object key to LongWritable key.
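The rejection comes from generic invariance: a Mapper<Object, …> is not a subtype of Mapper<LongWritable, …>, even though Object is a supertype of LongWritable. The following self-contained sketch uses hypothetical stand-in classes (not the real Hadoop/MRUnit types) to reproduce the compile error and the fix:

```java
// Hypothetical stand-ins for the real Hadoop/MRUnit classes, reduced to
// the type parameters that matter for the error in the question.
class LongWritable {}

class Mapper<KEYIN, VALUEIN> {}

class MapDriver<KEYIN, VALUEIN> {
    private Mapper<KEYIN, VALUEIN> mapper;

    // Same shape as MRUnit's setMapper: the argument's type parameters must
    // match the driver's exactly, because Java generics are invariant.
    void setMapper(Mapper<KEYIN, VALUEIN> m) {
        this.mapper = m;
    }
}

public class GenericsDemo {
    // Mirrors the broken declaration: key type Object instead of LongWritable.
    static class BadMapper extends Mapper<Object, String> {}

    // Mirrors the fix from the answer: key type matches the driver.
    static class GoodMapper extends Mapper<LongWritable, String> {}

    public static void main(String[] args) {
        MapDriver<LongWritable, String> driver = new MapDriver<>();
        // driver.setMapper(new BadMapper()); // does not compile: the same
        //                                    // "not applicable for the arguments" error
        driver.setMapper(new GoodMapper());   // compiles once the key type matches
        System.out.println("setMapper accepted GoodMapper");
    }
}
```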
Answer 1 (score: 2)
Make sure you have imported the correct class. I ran into the same error; unlike the case above, my program had the correct parameters in both the Reducer and reduce_test classes, but I still got the same error message reported above because I had imported the wrong class.

Wrongly imported class:

import org.apache.hadoop.mrunit.ReduceDriver;

Correct class:

import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;

The same solution applies in the mapper_test case, provided you are sure your parameters are the same in Mapper__class and Mapper_test.