hadoop - How long after a MapReduce job completes can we still check its status?

Time: 2014-04-08 11:51:07

Tags: java hadoop mapreduce

I have 3 MapReduce jobs that I want to run in parallel, so I did something like this:

Job[] job = new Job[3];
.
.
.
job[0].submit();
job[1].submit();
job[2].submit();

Then, to check whether all of the jobs have finished, I poll the 3 jobs:

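boolean isAllFinished = false;
while (!isAllFinished) {
    // Reset on every pass; an AND that starts from false can never become true.
    isAllFinished = true;
    for (int i = 0; i < 3; i++) {
        boolean complete = job[i].isComplete();
        log.debug("job[" + i + "].isComplete() >> " + complete);
        isAllFinished = isAllFinished && complete;
    }
    Thread.sleep(1000);
}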
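
The polling idea above can be sketched in a self-contained form. Running against a real cluster needs the Hadoop client libraries, so this sketch assumes a hypothetical `Pollable` interface standing in for `Job.isComplete()`; `PollAll`, `allComplete`, and `waitForAll` are illustrative names, not Hadoop APIs. The detail it shows is that the all-finished flag is recomputed from `true` on every pass.

```java
import java.util.List;

public class PollAll {
    /** Hypothetical stand-in for the part of Hadoop's Job used here. */
    interface Pollable {
        boolean isComplete() throws Exception;
    }

    /** Returns true only if every job reports completion on this pass. */
    static boolean allComplete(List<Pollable> jobs) throws Exception {
        boolean all = true; // start from true so the AND can only clear it
        for (Pollable job : jobs) {
            // isComplete() is on the left, so every job is polled on each pass.
            all = job.isComplete() && all;
        }
        return all;
    }

    /** Polls until all jobs are complete; returns the number of passes made. */
    static int waitForAll(List<Pollable> jobs, long sleepMillis) throws Exception {
        int passes = 1;
        while (!allComplete(jobs)) {
            Thread.sleep(sleepMillis);
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) throws Exception {
        // Fake jobs that report completion after a fixed number of polls.
        int[] remaining = {2, 0, 1};
        List<Pollable> jobs = List.of(
            () -> remaining[0]-- <= 0,
            () -> remaining[1]-- <= 0,
            () -> remaining[2]-- <= 0);
        System.out.println("passes = " + waitForAll(jobs, 10));
    }
}
```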

Although this usually works, it fails at random, and I get errors/logs like this:

14/04/08 18:43:59 DEBUG FMS: job[0].isComplete() >> false
14/04/08 18:44:00 DEBUG FMS: job[1].isComplete() >> false
14/04/08 18:44:01 DEBUG FMS: job[2].isComplete() >> false
14/04/08 18:44:12 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
14/04/08 18:44:13 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/04/08 18:44:14 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/04/08 18:44:15 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

I am wondering whether there is a timeout on how long we can check a job's status through the Job class. Thanks for any thoughts on this!

1 Answer:

Answer 0: (score: 1)

You can make the client wait for a Hadoop job to finish by calling job.waitForCompletion(true);

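You can try the following code. Because the jobs have already been submitted, they keep running in parallel; each call only blocks until that particular job finishes: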

job[0].waitForCompletion(true);
job[1].waitForCompletion(true);
job[2].waitForCompletion(true);

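If you want to supply a timeout, be careful with the wait(timeout) method: it is Object.wait(long), inherited from java.lang.Object, not a Hadoop job timeout. The call below throws IllegalMonitorStateException unless the calling thread holds the job's monitor (that is, it runs inside a synchronized block on the job), and it does not report whether the job finished.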

job[0].wait(1000);
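
One way to get an actual timeout is to poll with a deadline. This is a minimal sketch, assuming a hypothetical `Pollable` interface in place of Hadoop's `Job` (only `isComplete()` is used); `TimedWait` and `waitWithTimeout` are illustrative names, not part of the Hadoop API.

```java
import java.util.concurrent.TimeUnit;

public class TimedWait {
    /** Hypothetical stand-in for Job; only isComplete() is needed. */
    interface Pollable {
        boolean isComplete() throws Exception;
    }

    /**
     * Polls until the job completes or the timeout elapses.
     * Returns true if the job completed in time, false on timeout.
     */
    static boolean waitWithTimeout(Pollable job, long timeoutMillis, long pollMillis)
            throws Exception {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!job.isComplete()) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // timed out; the job itself is not stopped
            }
            // Sleep for the poll interval, but not meaningfully past the deadline.
            long sleepMillis = Math.min(pollMillis,
                    TimeUnit.NANOSECONDS.toMillis(remainingNanos) + 1);
            Thread.sleep(sleepMillis);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        // Fake job that completes about 50 ms after start.
        Pollable quick = () -> System.currentTimeMillis() - start >= 50;
        System.out.println(waitWithTimeout(quick, 1000, 10));
        // Fake job that never completes.
        System.out.println(waitWithTimeout(() -> false, 100, 10));
    }
}
```

With a real job, passing `job[0]::isComplete` as the `Pollable` should work the same way. On timeout the job keeps running on the cluster, so the caller still has to decide whether to keep waiting or stop it, for example with `killJob()`.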