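I would like to set some kind of variable or flag during the map phase of my MR job that I can check after the job completes. I think the best way to show what I want is with some code (P.S. I am using Hadoop 2.2.0):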
public class MRJob {
    public static class MapperTest
            extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context
                ) throws IOException, InterruptedException {
            //Do some computation to get new value and key
            ...
            //Check if new value equal to some condition e.g if(value < 1) set global variable to true
            context.write(newKey, newValue);
        }
    }
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word_count");
        //set job configs
        job.waitForCompletion(true);
        //Here I want to be able to check if my global variable has been set to true by any one of the mappers
    }
}
Answer 0 (score: 2)
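Use a Counter for this. Counters are incremented inside the tasks and aggregated by the framework, so the driver can read the total once the job has finished.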
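public static enum UpdateCounter {
    UPDATED
}

// Key type is Object to match Mapper<Object, Text, Text, IntWritable> from the question;
// with @Override, a LongWritable key would not compile against that mapper.
@Override
public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    // Compute newKey (Text) and newValue (IntWritable) as in the question
    ...
    // Assumes the condition is tested on the computed IntWritable
    if (newValue.get() < 1) {
        context.getCounter(UpdateCounter.UPDATED).increment(1);
    }
    context.write(newKey, newValue);
}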
After the job completes, you can check the counter:
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word_count");
//set job configs
job.waitForCompletion(true);
// Read the total aggregated over all tasks of the finished job
long counter = job.getCounters().findCounter(UpdateCounter.UPDATED).getValue();
if (counter > 0) {
    // some mapper has seen the condition
}
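To see why a counter works here where a static field would not: map tasks typically run in their own JVMs, often on other nodes, so the driver never sees a variable set inside a mapper. Counter values, by contrast, are reported by each task and summed by the framework. The following is only a rough plain-Java model of that aggregation, with no Hadoop dependency. The class `CounterSketch` and the methods `runMapTask`, `jobTotal` and `runJob` are made up for illustration and are not Hadoop APIs; only the `UpdateCounter` enum mirrors the code above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of counter aggregation; none of these names are Hadoop APIs.
class CounterSketch {

    // Mirrors the UpdateCounter enum from the answer.
    enum UpdateCounter { UPDATED }

    // Hypothetical stand-in for one map task: counts the values below 1
    // in its own input split and reports that count when it finishes.
    static Map<UpdateCounter, Long> runMapTask(int[] split) {
        Map<UpdateCounter, Long> local = new EnumMap<>(UpdateCounter.class);
        local.put(UpdateCounter.UPDATED, 0L);
        for (int value : split) {
            if (value < 1) {
                local.merge(UpdateCounter.UPDATED, 1L, Long::sum);
            }
        }
        return local;
    }

    // Hypothetical stand-in for the framework: sums the per-task counts
    // into a single job-level total.
    static long jobTotal(List<Map<UpdateCounter, Long>> perTask) {
        long total = 0;
        for (Map<UpdateCounter, Long> counters : perTask) {
            total += counters.getOrDefault(UpdateCounter.UPDATED, 0L);
        }
        return total;
    }

    // Runs every split as its own task and returns the aggregated count.
    static long runJob(int[][] splits) {
        List<Map<UpdateCounter, Long>> perTask = new ArrayList<>();
        for (int[] split : splits) {
            perTask.add(runMapTask(split));
        }
        return jobTotal(perTask);
    }

    public static void main(String[] args) {
        int[][] splits = { {5, 3, 7}, {2, 0, 9}, {4, -1, 6} };
        long updated = runJob(splits);
        // The driver only needs to know whether the total is non-zero.
        System.out.println("UPDATED = " + updated + ", flag = " + (updated > 0));
    }
}
```

In the real job the driver does the same final step: it treats a non-zero `UPDATED` total as "at least one mapper saw the condition". If you only need a boolean, you can ignore the exact count.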