Question

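I am running a simple MapReduce job in Hadoop. In plain Java I can compute a start time and an end time with System.currentTimeMillis(). How can I do the same in MapReduce, i.e. measure (endTime - startTime) for map and (endTime - startTime) for reduce? I tried the following code, with job.setNumReduceTasks(4) set.

Edit: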

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    // process values
    long start = System.currentTimeMillis();
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
    long end = System.currentTimeMillis();

    System.out.println(" time Taken " + (end - start));
}

But the output is:

time Taken 1
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 time Taken 0
 ----------
 ----------

But I set the number of reduce tasks to 4, and this prints the time taken for each key-value pair instead.

After adding the setup() and cleanup() methods:

public void run(Context context) throws IOException, InterruptedException {
    start = System.currentTimeMillis();
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
        end = System.currentTimeMillis();
        System.out.println(" End- Start : " + (end - start));
    }
}

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
}

I set the number of reducers to 4 with job.setNumReduceTasks(4), but it shows only one timestamp. What am I doing wrong here?

Answer 1

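To find the total time taken by a reducer, you can:

Add a long field to the reducer class to hold the start time.
Set the start time in the reducer's setup() method.
In the reducer's cleanup() method, take the end time and subtract the stored start time to get the total.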
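
The steps above can be sketched roughly as follows. This is a plain-Java stand-in so that it runs without the Hadoop jars: TimedReducerSketch and its methods are hypothetical names, not Hadoop API. In a real job the class would extend org.apache.hadoop.mapreduce.Reducer, and setup(), reduce() and cleanup() would each receive the framework's Context, as in the question's code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hadoop reducer; it only mimics the lifecycle
// setup() -> reduce() once per key -> cleanup().
class TimedReducerSketch {
    private long startMillis;    // recorded once per task in setup()
    private long elapsedMillis;  // computed once per task in cleanup()
    private final Map<String, Integer> output = new LinkedHashMap<>();

    // Runs once before the first key: remember when the task started.
    void setup() {
        startMillis = System.currentTimeMillis();
    }

    // Runs once per key: sum the values. No timing here, otherwise we
    // would measure a single key instead of the whole task.
    void reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        output.put(key, sum);
    }

    // Runs once after the last key: end time minus stored start time.
    void cleanup() {
        elapsedMillis = System.currentTimeMillis() - startMillis;
        System.out.println("Reduce task took " + elapsedMillis + " ms");
    }

    // Same call order as the run() method shown in the question.
    void run(Map<String, List<Integer>> groupedInput) {
        setup();
        try {
            for (Map.Entry<String, List<Integer>> e : groupedInput.entrySet()) {
                reduce(e.getKey(), e.getValue());
            }
        } finally {
            cleanup();
        }
    }

    Map<String, Integer> getOutput() { return output; }

    long getElapsedMillis() { return elapsedMillis; }

    public static void main(String[] args) {
        Map<String, List<Integer>> input = new LinkedHashMap<>();
        input.put("apple", Arrays.asList(1, 1, 1));
        input.put("pear", Arrays.asList(2, 5));

        TimedReducerSketch task = new TimedReducerSketch();
        task.run(input);
        System.out.println(task.getOutput());
    }
}
```

With this layout there is no need to override run(). As for seeing only one line: on a cluster each reduce task usually runs in its own JVM, so with four reducers I would expect one such line in each task's own stdout log rather than four lines in one place. That is an assumption about your setup, since the question does not say how the job was launched.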

How do I programmatically get the execution time of each reduce task in Hadoop?
