I am trying to run an iterative workflow with MapReduce.
I have 3 jobs that run in sequence:
static enum UpdateCounter {
INCOMING_ATTR
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
int res = ToolRunner.run(conf, new Driver(), args);
System.exit(res);
}
@Override
public int run(String[] args) throws Exception {
while(counter >= 0){
Configuration conf = getConf();
/*
* Job 1:
*/
Job job1 = new Job(conf, "");
//other configuration
job1.setMapperClass(ID3ClsLabelMapper.class);
job1.setReducerClass(ID3ClsLabelReducer.class);
Path in = new Path(args[0]);
Path out1 = new Path(CL);
if(counter == 0){
FileInputFormat.addInputPath(job1, in);
}
else{
FileInputFormat.addInputPath(job1, out5);
}
FileInputFormat.addInputPath(job1, in);
FileOutputFormat.setOutputPath(job1,out1);
job1.waitForCompletion(true);
/*
* Job 2:
*
*/
Configuration conf2 = getConf();
Job job2 = new Job(conf2, "");
Path out2 = new Path(ANC);
FileInputFormat.addInputPath(job2, in);
FileOutputFormat.setOutputPath(job2,out2);
job2.waitForCompletion(true);
/*
* Job3
*/
Configuration conf3 = getConf();
Job job3 = new Job(conf3, "");
System.out.println("conf3");
Path out5 = new Path(args[1]);
if(fs.exists(out5)){
fs.delete(out5, true);
}
FileInputFormat.addInputPath(job3,out2);
FileOutputFormat.setOutputPath(job3,out5);
job3.waitForCompletion(true);
FileInputFormat.addInputPath(job3,new Path(args[0]));
FileOutputFormat.setOutputPath(job3,out5);
job3.waitForCompletion(true);
counter = job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();
}
return 0;
}
Job 3 reducer
public class ID3GSReducer extends Reducer<NullWritable, Text, NullWritable, Text>{
public static final String UpdateCounter = null;
NullWritable out = NullWritable.get();
public void reduce(NullWritable key,Iterable<Text> values ,Context context) throws IOException, InterruptedException{
for(Text val : values){
String v = val.toString();
context.getCounter(UpdateCounter.INCOMING_ATTR).increment(1);
context.write(out, new Text(v));
}
}
}
But it shows:
14/06/12 10:12:30 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
14/06/12 10:12:30 INFO mapred.JobClient: Total committed heap usage (bytes)=1238630400
conf3
Exception in thread "main" java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)
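How can I iterate over the jobs above?
All 3 jobs should keep running until INCOMING_ATTR == 0,
and the output of job3 (args[1]) should be the input of job 1 in the second iteration. What should I change to achieve this?
Please advise.
What am I doing wrong?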
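To make the intended loop concrete, here is a minimal sketch in plain Java with no Hadoop classes. It assumes a hypothetical `runPass` callback that stands in for "run job1, job2 and job3 and return job3's INCOMING_ATTR counter"; `IterationSketch` and `iterate` are illustrative names and are not part of my driver. The first pass reads the original input, every later pass reads job3's output, and the loop stops once a pass reports a counter of 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

class IterationSketch {

    /**
     * Repeats passes until a pass reports a counter value of 0.
     *
     * @param firstInput  input path for the first pass (args[0] in my driver)
     * @param job3Output  output path of job3 (args[1]), reused as input afterwards
     * @param runPass     hypothetical stand-in: runs one full pass on the given
     *                    input path and returns the INCOMING_ATTR counter
     * @return the input path used by each pass, in order
     */
    static List<String> iterate(String firstInput,
                                String job3Output,
                                ToLongFunction<String> runPass) {
        List<String> inputsUsed = new ArrayList<>();
        String input = firstInput;
        long counter;
        do {
            inputsUsed.add(input);
            // The counter is read only after the pass has finished.
            counter = runPass.applyAsLong(input);
            // Every later pass consumes what job3 wrote in this pass.
            input = job3Output;
        } while (counter > 0);
        return inputsUsed;
    }

    public static void main(String[] args) {
        // Simulated counter values for three passes (made-up numbers).
        long[] remaining = {2, 1, 0};
        int[] pass = {0};
        List<String> used = iterate("in", "out5", p -> remaining[pass[0]++]);
        System.out.println(used); // prints [in, out5, out5]
    }
}
```

In this sketch the counter is always read from a pass that has already completed, which is the behaviour I want from the real driver as well.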