Below is a MapReduce program that does the filtering in the map function and the summing in the reduce step.
The map part executes fine, but when the reduce part runs, it gets stuck at the context.write(key, value) line.
This only happens when the reducer writes an output type different from the one written by the map function.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Filter3 {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, Contestant> {
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String[] cols = value.toString().split(",");
            try {
                Contestant val = new Contestant(cols[0], cols[1], cols[2]);
                System.out.println(key + " ::: " + val); // debug output
                val.name = val.name.toUpperCase();
                if (val.rating >= 9) {
                    context.write(new Text(val.name), val); // write null if it is not required
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    public static class AvgRatingReducer extends Reducer<Text, Contestant, Text, DoubleWritable> {
        private DoubleWritable result = new DoubleWritable(0.0);

        public void reduce(Text key, Iterable<Contestant> values, Context context) throws IOException, InterruptedException {
            double sum = 0.0;
            int count = 0;
            for (Contestant val : values) {
                sum += val.rating;
                count++;
            }
            if (count > 0) {
                result.set(sum / (double) count);
            }
            System.out.println(result); // debug output
            context.write(key, result); // execution hangs here
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "AvgMRJob"); // configuration and job name
        job.setJarByClass(Filter3.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(AvgRatingReducer.class);
        job.setReducerClass(AvgRatingReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(DoubleWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        Path inPath = new Path(args[0]);
        Path outPath = new Path(args[1]);
        outPath.getFileSystem(conf).delete(outPath, true);
        FileInputFormat.addInputPath(job, inPath);
        FileOutputFormat.setOutputPath(job, outPath);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
The Writable class used is:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableUtils;

public class Contestant implements Writable {
    long id;
    String name;
    double rating;

    public Contestant() {}

    public Contestant(long id, String name, double rating) {
        this.id = id;
        this.name = name;
        this.rating = rating;
    }

    public Contestant(String id, String name, String rating) {
        try {
            this.id = Long.parseLong(id.trim());
        } catch (Exception ex) {
            // leave id at its default (0) if unparsable
        }
        this.name = name;
        try {
            this.rating = Double.parseDouble(rating.trim());
        } catch (Exception ex) {
            // leave rating at its default (0.0) if unparsable
        }
    }

    @Override
    public void readFields(DataInput inp) throws IOException {
        id = inp.readLong();
        name = WritableUtils.readString(inp);
        rating = inp.readDouble();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        WritableUtils.writeString(out, name);
        out.writeDouble(rating);
    }

    @Override
    public String toString() {
        return this.id + "," + this.name + "," + this.rating;
    }
}
Execution gets stuck inside the reduce function when writing the output to the context. There is no error or exception; it simply hangs indefinitely, and I cannot figure out what the problem is. I followed the usual MapReduce procedure.
Note: the same program works if I write the same types in both map and reduce, i.e. if I write (key = Text, value = Contestant) in both the map and reduce functions, instead of using DoubleWritable in the reduce!
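Not part of the original post, but a self-contained illustration of why mismatched types between what one stage serializes and what the next stage deserializes can corrupt the stream rather than fail cleanly. This stdlib-only sketch imitates the byte layouts involved (readUTF stands in for Hadoop's WritableUtils.readString): a value serialized as a bare double is then deserialized with the Contestant record layout, which silently misreads eight bytes and then runs out of data. In a real shuffle, a framework reader in such a state can appear to hang instead of reporting an error.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TypeMismatchDemo {

    // Stand-in for DoubleWritable.write: a bare 8-byte double.
    static byte[] writeDouble(double d) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeDouble(d);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for Contestant.readFields: long id, string name, double rating.
    static String readAsContestant(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long id = in.readLong();     // silently consumes the 8 double bytes as a long
            String name = in.readUTF();  // no bytes left: the stream is exhausted
            double rating = in.readDouble();
            return id + "," + name + "," + rating;
        } catch (EOFException e) {
            return "EOF: ran out of bytes mid-record";
        } catch (IOException e) {
            return "IO error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Serialize one value the way a DoubleWritable would...
        byte[] doubleBytes = writeDouble(9.5);
        // ...then deserialize it the way a Contestant would.
        System.out.println(readAsContestant(doubleBytes)); // prints: EOF: ran out of bytes mid-record
    }
}
```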
Answer 0 (score: 1):
Remove the combiner:

// job.setCombinerClass(AvgRatingReducer.class);

If you use a combiner, you need to make sure the reducer works on the output of the combiner class, not on the output of the mapper.
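For context, the corresponding driver setup with the combiner removed might look like this (a sketch based on the question's own main(); everything else stays as it was):

```java
// Sketch: the question's job setup without the combiner line.
// The reducer cannot double as a combiner here, because its input pair
// <Text, Contestant> differs from its output pair <Text, DoubleWritable>.
Job job = Job.getInstance(conf, "AvgMRJob");
job.setJarByClass(Filter3.class);
job.setMapperClass(TokenizerMapper.class);
// no setCombinerClass(...) call
job.setReducerClass(AvgRatingReducer.class);
```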
Answer 1 (score: 0):
In MapReduce, a combiner's input <key, value> pair types and its output <key, value> pair types must be the same. This rule applies to combiners; no such rule exists for reducers.

In this case, the reducer reads the same <key, value> pair types as the mapper's output, <Text, Contestant>, and writes <Text, DoubleWritable> as its output pair.

So the job works without a combiner. When adding a combiner, we must make sure that the input <key, value> pair types and the output <key, value> pair types of the combiner step are identical, i.e. <key1, value1> in and <key1, value1> out.

Here, reusing the same reducer class as the combiner is incorrect because the rule above is not satisfied: the reducer's input <key, value> pair types differ from its output <key, value> pair types.
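Neither answer shows it, but if a combiner is actually wanted for an average, the standard approach is to make the intermediate value carry a partial (sum, count), so that the combine step is type-preserving and associative. Below is a minimal stdlib-only sketch of that algebra; SumCount is a hypothetical class not in the original post, and in real Hadoop code it would be a custom Writable used as both the map output value type and the combiner output value type.

```java
import java.util.List;

public class AvgCombineDemo {

    // Partial aggregate; a combiner with SumCount in and SumCount out is legal.
    record SumCount(double sum, long count) {
        SumCount merge(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }
        double average() {
            return sum / count;
        }
    }

    // The combine step: fold partials into one partial of the SAME type.
    static SumCount combine(List<SumCount> parts) {
        SumCount acc = new SumCount(0.0, 0L);
        for (SumCount p : parts) {
            acc = acc.merge(p);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Each map call emits (rating, 1); two map tasks pre-combine locally.
        SumCount fromMap1 = combine(List.of(new SumCount(9.0, 1), new SumCount(10.0, 1)));
        SumCount fromMap2 = combine(List.of(new SumCount(9.5, 1)));
        // The reducer merges the combined partials and only then divides.
        double avg = fromMap1.merge(fromMap2).average();
        System.out.println(avg); // prints: 9.5
    }
}
```

The key design point is that merging is associative, so it is correct no matter how many times (or whether) the framework runs the combiner.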