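I am running a simple MapReduce job in Hadoop. In plain Java I can measure the start and end time with System.currentTimeMillis().
How do I do the same in MapReduce, that is, get (endTime - startTime) for map and (endTime - startTime) for reduce? I tried the following code, with job.setNumReduceTasks(4) set.
Edit: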
public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    // process values
    long start = System.currentTimeMillis();
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
    long end = System.currentTimeMillis();
    System.out.println(" time Taken " + (end - start));
}
But the output is:
time Taken 1
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
time Taken 0
----------
----------
But I set the number of reduce tasks to 4, and what it prints is the time taken to process each key/value group, not the time per reduce task.
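To illustrate why the per-key numbers above come out as 0 or 1: summing a handful of values takes far less than a millisecond, so System.currentTimeMillis() cannot resolve it. A minimal sketch in plain Java (no Hadoop involved; PerKeyTimingSketch, sum and totalNanos are hypothetical names made up for this illustration) that accumulates System.nanoTime() differences across calls instead:

```java
import java.util.Arrays;
import java.util.List;

public class PerKeyTimingSketch {
    // Hypothetical stand-in for the body of reduce(): sums one key's values.
    static int sum(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Accumulates the nanoseconds spent inside sum() over all keys.
    static long totalNanos(List<List<Integer>> valuesPerKey) {
        long total = 0;
        for (List<Integer> values : valuesPerKey) {
            long start = System.nanoTime();
            sum(values);
            total += System.nanoTime() - start;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up input: three keys with a few values each.
        List<List<Integer>> valuesPerKey = Arrays.asList(
                Arrays.asList(1, 1, 1),
                Arrays.asList(1, 1),
                Arrays.asList(1));
        long nanos = totalNanos(valuesPerKey);
        System.out.println("total reduce time: " + nanos + " ns ("
                + nanos / 1_000_000 + " ms)");
    }
}
```

In a real reducer the same idea would presumably mean keeping the running total in a field and printing it once at the end, rather than printing inside every reduce() call.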
After adding the setup() and cleanup() calls (by overriding run()):
public void run(Context context) throws IOException, InterruptedException {
    start = System.currentTimeMillis();
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
        end = System.currentTimeMillis();
        System.out.println(" End- Start : " + (end - start));
    }
}

public void reduce(Text _key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();
    }
    result.set(sum);
    context.write(_key, result);
}
I set the number of reducers to 4 with job.setNumReduceTasks(4).
But it prints only one elapsed time. What am I doing wrong here?
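For reference, this is what I expected to see, sketched in plain Java without Hadoop: four tasks, each running its own run()-style loop over its share of the keys and each printing its own elapsed time. PerTaskTimingSketch, partition and assign are hypothetical names for this illustration only. The partition formula has the same shape as Hadoop's default HashPartitioner, but it is applied here to String keys, and the sketch runs the tasks one after another in a single JVM, which is an assumption and not how a cluster schedules them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerTaskTimingSketch {
    // Picks a task index for a key: non-negative hash modulo the task count.
    static int partition(String key, int numTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numTasks;
    }

    // Groups the keys by the task index they are assigned to.
    static List<List<String>> assign(List<String> keys, int numTasks) {
        List<List<String>> perTask = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            perTask.add(new ArrayList<>());
        }
        for (String key : keys) {
            perTask.get(partition(key, numTasks)).add(key);
        }
        return perTask;
    }

    public static void main(String[] args) {
        // Made-up keys; 4 mirrors job.setNumReduceTasks(4).
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        List<List<String>> perTask = assign(keys, 4);
        for (int task = 0; task < perTask.size(); task++) {
            long start = System.nanoTime();
            int processed = 0;
            for (String key : perTask.get(task)) {
                processed++; // stand-in for one reduce() call on this key
            }
            long elapsed = System.nanoTime() - start;
            System.out.println("task " + task + ": " + processed
                    + " keys, End - Start : " + elapsed + " ns");
        }
    }
}
```

Run like this, the sketch prints one timing line per task, so four lines in total. Where the corresponding println output of a real job ends up presumably depends on how the job is run.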