Getting only one aggregated value from the reducer

Time: 2015-04-21 12:16:43

Tags: java hadoop mapreduce weka

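I am trying to print the confusion matrix of a J48 classifier using Weka. The output I get is one matrix per mapper, rather than a single combined matrix. The number of mappers is set to 2.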

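This class is the reducer for the Weka classifier output. The mappers feed it a set of cross-validated data chunks, and its job is to aggregate them into a single solution.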

public void reduce(Text key, Iterable<AggregateableEvaluation> values, Context context) throws IOException, InterruptedException {      
        int sum = 0;                    
        // loop through each of the values and "aggregate"
        // which basically means to consolidate the values
        for (AggregateableEvaluation val : values) {
            System.out.println("IN THE REDUCER!");

            // The first time through, give aggEval a value
            if (sum == 0) {
                try {
                    aggEval = val;
                }
                catch (Exception e) {
                    e.printStackTrace();
                }
            }
            else {
                // combine the values
                aggEval.aggregate(val);
            }

            try {
                // This is what is taken from the mapper to be aggregated
                //System.out.println("This is the map result");
                //System.out.println(aggEval.toMatrixString());
            }
            catch (Exception e) {
                e.printStackTrace();
            }                       

            sum += 1;
        }           
        try {
            System.out.println("This is reduce matrix");
            System.out.println(aggEval.toMatrixString());
        }
        catch (Exception e) {
            e.printStackTrace();
        }
}

1 Answer:

Answer 0 (score: 0)

I know nothing about WEKA, but for "plain" MapReduce your reduce function should take the following form: https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Reducer.html

 public class IntSumReducer<Key> extends Reducer<Key,IntWritable,
                                                 Key,IntWritable> {
   private IntWritable result = new IntWritable();

   public void reduce(Key key, Iterable<IntWritable> values,
                      Context context) throws IOException, InterruptedException {
     int sum = 0;
     for (IntWritable val : values) {
       sum += val.get();
     }
     result.set(sum);
     context.write(key, result);
   }
 }

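So basically, the reduce method is called once for each key. You get all the values that were mapped to that particular key, you aggregate them together, and once you are done you call context.write(key, aggEval) to emit the result from the reduce method.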
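To make the "once per key" behavior concrete, here is a minimal plain-Java sketch with no Hadoop or Weka dependencies. Everything in it is a hypothetical stand-in: `ConfusionMatrix` plays the role of `AggregateableEvaluation`, `run` imitates the shuffle that groups mapper output by key, and the returned value stands where `context.write(key, aggEval)` would go in a real job. It also illustrates one possible (unconfirmed) reason for seeing one matrix per mapper: if each mapper emits under a different key, reduce runs once per key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReduceSketch {

    // Hypothetical stand-in for Weka's AggregateableEvaluation: a 2x2 confusion matrix.
    static final class ConfusionMatrix {
        final int[][] counts = new int[2][2];

        ConfusionMatrix(int a, int b, int c, int d) {
            counts[0][0] = a; counts[0][1] = b;
            counts[1][0] = c; counts[1][1] = d;
        }

        // Adds the other matrix into this one, cell by cell.
        void aggregate(ConfusionMatrix other) {
            for (int r = 0; r < 2; r++) {
                for (int c = 0; c < 2; c++) {
                    counts[r][c] += other.counts[r][c];
                }
            }
        }
    }

    // Mirrors the reduce contract: called once per key with all values for that key.
    // Returning the total stands in for context.write(key, total) in Hadoop.
    static ConfusionMatrix reduce(String key, Iterable<ConfusionMatrix> values) {
        ConfusionMatrix total = new ConfusionMatrix(0, 0, 0, 0);
        for (ConfusionMatrix val : values) {
            total.aggregate(val);
        }
        return total;
    }

    // Imitates the shuffle phase: group mapper output by key, then reduce each group once.
    static Map<String, ConfusionMatrix> run(String[] keys, ConfusionMatrix[] values) {
        Map<String, List<ConfusionMatrix>> grouped = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            grouped.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
        }
        Map<String, ConfusionMatrix> results = new LinkedHashMap<>();
        for (Map.Entry<String, List<ConfusionMatrix>> e : grouped.entrySet()) {
            results.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return results;
    }

    public static void main(String[] args) {
        ConfusionMatrix[] mapperOutput = {
            new ConfusionMatrix(5, 1, 2, 4),
            new ConfusionMatrix(3, 2, 1, 6)
        };
        // Both mappers emit under the same key: a single aggregated matrix.
        System.out.println(run(new String[] {"eval", "eval"}, mapperOutput).size()); // prints 1
        // Each mapper emits under its own key: one matrix per key.
        System.out.println(run(new String[] {"m0", "m1"}, mapperOutput).size());     // prints 2
    }
}
```

If your mappers already share a key and you still see two matrices, the cause is somewhere else (for example, the number of reduce calls or where the printing happens), so treat this only as a way to reason about the grouping.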