I am running a test on a 5-node cluster with Hadoop 1.0.3. The test consists of a chain of 3 jobs. The first job works flawlessly. The second job takes the output of the first one (roughly 100 MB) as its input. After map reaches 100% smoothly, the job gets stuck between the map phase and the reduce phase, and getting reduce up to just 5% takes a huge amount of time. Here is the full Hadoop output with timestamps.
13/11/19 13:39:25 INFO mapred.JobClient: map 0% reduce 0%
13/11/19 13:40:12 INFO mapred.JobClient: map 1% reduce 0%
13/11/19 13:40:20 INFO mapred.JobClient: map 2% reduce 0%
13/11/19 13:40:24 INFO mapred.JobClient: map 3% reduce 0%
13/11/19 13:40:29 INFO mapred.JobClient: map 4% reduce 0%
13/11/19 13:40:39 INFO mapred.JobClient: map 5% reduce 0%
13/11/19 13:40:42 INFO mapred.JobClient: map 6% reduce 0%
13/11/19 13:40:51 INFO mapred.JobClient: map 7% reduce 0%
13/11/19 13:41:01 INFO mapred.JobClient: map 8% reduce 0%
13/11/19 13:41:06 INFO mapred.JobClient: map 9% reduce 0%
13/11/19 13:41:18 INFO mapred.JobClient: map 10% reduce 0%
13/11/19 13:41:22 INFO mapred.JobClient: map 11% reduce 0%
13/11/19 13:41:33 INFO mapred.JobClient: map 12% reduce 0%
13/11/19 13:41:42 INFO mapred.JobClient: map 13% reduce 0%
13/11/19 13:41:50 INFO mapred.JobClient: map 14% reduce 0%
13/11/19 13:41:55 INFO mapred.JobClient: map 15% reduce 0%
13/11/19 13:42:04 INFO mapred.JobClient: map 16% reduce 0%
13/11/19 13:42:11 INFO mapred.JobClient: map 17% reduce 0%
13/11/19 13:42:18 INFO mapred.JobClient: map 18% reduce 0%
13/11/19 13:42:29 INFO mapred.JobClient: map 19% reduce 0%
13/11/19 13:42:36 INFO mapred.JobClient: map 20% reduce 0%
13/11/19 13:42:42 INFO mapred.JobClient: map 21% reduce 0%
13/11/19 13:42:50 INFO mapred.JobClient: map 22% reduce 0%
13/11/19 13:42:57 INFO mapred.JobClient: map 23% reduce 0%
13/11/19 13:43:07 INFO mapred.JobClient: map 24% reduce 0%
13/11/19 13:43:17 INFO mapred.JobClient: map 25% reduce 0%
13/11/19 13:43:27 INFO mapred.JobClient: map 26% reduce 0%
13/11/19 13:43:37 INFO mapred.JobClient: map 27% reduce 0%
13/11/19 13:43:47 INFO mapred.JobClient: map 28% reduce 0%
13/11/19 13:43:54 INFO mapred.JobClient: map 29% reduce 0%
13/11/19 13:44:03 INFO mapred.JobClient: map 30% reduce 0%
13/11/19 13:44:12 INFO mapred.JobClient: map 31% reduce 0%
13/11/19 13:44:18 INFO mapred.JobClient: map 32% reduce 0%
13/11/19 13:44:28 INFO mapred.JobClient: map 33% reduce 0%
13/11/19 13:44:38 INFO mapred.JobClient: map 34% reduce 0%
13/11/19 13:44:48 INFO mapred.JobClient: map 35% reduce 0%
13/11/19 13:44:54 INFO mapred.JobClient: map 36% reduce 0%
13/11/19 13:45:02 INFO mapred.JobClient: map 37% reduce 0%
13/11/19 13:45:16 INFO mapred.JobClient: map 38% reduce 0%
13/11/19 13:45:21 INFO mapred.JobClient: map 39% reduce 0%
13/11/19 13:45:33 INFO mapred.JobClient: map 40% reduce 0%
13/11/19 13:45:39 INFO mapred.JobClient: map 41% reduce 0%
13/11/19 13:45:50 INFO mapred.JobClient: map 42% reduce 0%
13/11/19 13:45:58 INFO mapred.JobClient: map 43% reduce 0%
13/11/19 13:46:06 INFO mapred.JobClient: map 44% reduce 0%
13/11/19 13:46:17 INFO mapred.JobClient: map 45% reduce 0%
13/11/19 13:46:23 INFO mapred.JobClient: map 46% reduce 0%
13/11/19 13:46:32 INFO mapred.JobClient: map 47% reduce 0%
13/11/19 13:46:39 INFO mapred.JobClient: map 48% reduce 0%
13/11/19 13:46:44 INFO mapred.JobClient: map 49% reduce 0%
13/11/19 13:46:54 INFO mapred.JobClient: map 50% reduce 0%
13/11/19 13:47:01 INFO mapred.JobClient: map 51% reduce 0%
13/11/19 13:47:09 INFO mapred.JobClient: map 52% reduce 0%
13/11/19 13:47:20 INFO mapred.JobClient: map 53% reduce 0%
13/11/19 13:47:26 INFO mapred.JobClient: map 54% reduce 0%
13/11/19 13:47:36 INFO mapred.JobClient: map 55% reduce 0%
13/11/19 13:47:47 INFO mapred.JobClient: map 56% reduce 0%
13/11/19 13:47:59 INFO mapred.JobClient: map 57% reduce 0%
13/11/19 13:48:02 INFO mapred.JobClient: map 58% reduce 0%
13/11/19 13:48:14 INFO mapred.JobClient: map 59% reduce 0%
13/11/19 13:48:25 INFO mapred.JobClient: map 60% reduce 0%
13/11/19 13:48:37 INFO mapred.JobClient: map 61% reduce 0%
13/11/19 13:48:48 INFO mapred.JobClient: map 62% reduce 0%
13/11/19 13:48:56 INFO mapred.JobClient: map 63% reduce 0%
13/11/19 13:49:07 INFO mapred.JobClient: map 64% reduce 0%
13/11/19 13:49:17 INFO mapred.JobClient: map 65% reduce 0%
13/11/19 13:49:27 INFO mapred.JobClient: map 66% reduce 0%
13/11/19 13:49:36 INFO mapred.JobClient: map 67% reduce 0%
13/11/19 13:49:45 INFO mapred.JobClient: map 68% reduce 0%
13/11/19 13:49:55 INFO mapred.JobClient: map 69% reduce 0%
13/11/19 13:50:03 INFO mapred.JobClient: map 70% reduce 0%
13/11/19 13:50:17 INFO mapred.JobClient: map 71% reduce 0%
13/11/19 13:50:26 INFO mapred.JobClient: map 72% reduce 0%
13/11/19 13:50:35 INFO mapred.JobClient: map 73% reduce 0%
13/11/19 13:50:46 INFO mapred.JobClient: map 74% reduce 0%
13/11/19 13:50:56 INFO mapred.JobClient: map 75% reduce 0%
13/11/19 13:51:04 INFO mapred.JobClient: map 76% reduce 0%
13/11/19 13:51:13 INFO mapred.JobClient: map 77% reduce 0%
13/11/19 13:51:19 INFO mapred.JobClient: map 78% reduce 0%
13/11/19 13:51:33 INFO mapred.JobClient: map 79% reduce 0%
13/11/19 13:51:41 INFO mapred.JobClient: map 80% reduce 0%
13/11/19 13:51:51 INFO mapred.JobClient: map 81% reduce 0%
13/11/19 13:52:02 INFO mapred.JobClient: map 82% reduce 0%
13/11/19 13:52:07 INFO mapred.JobClient: map 83% reduce 0%
13/11/19 13:52:18 INFO mapred.JobClient: map 84% reduce 0%
13/11/19 13:52:30 INFO mapred.JobClient: map 85% reduce 0%
13/11/19 13:52:41 INFO mapred.JobClient: map 86% reduce 0%
13/11/19 13:52:54 INFO mapred.JobClient: map 87% reduce 0%
13/11/19 13:53:06 INFO mapred.JobClient: map 88% reduce 0%
13/11/19 13:53:22 INFO mapred.JobClient: map 89% reduce 0%
13/11/19 13:53:32 INFO mapred.JobClient: map 90% reduce 0%
13/11/19 13:53:37 INFO mapred.JobClient: map 91% reduce 0%
13/11/19 13:53:54 INFO mapred.JobClient: map 92% reduce 0%
13/11/19 13:54:09 INFO mapred.JobClient: map 93% reduce 0%
13/11/19 13:54:25 INFO mapred.JobClient: map 94% reduce 0%
13/11/19 13:54:34 INFO mapred.JobClient: map 95% reduce 0%
13/11/19 13:54:49 INFO mapred.JobClient: map 96% reduce 0%
13/11/19 13:55:12 INFO mapred.JobClient: map 97% reduce 0%
13/11/19 13:55:28 INFO mapred.JobClient: map 98% reduce 0%
13/11/19 13:56:00 INFO mapred.JobClient: map 99% reduce 0%
13/11/19 13:56:58 INFO mapred.JobClient: map 100% reduce 0%
13/11/19 14:19:20 INFO mapred.JobClient: map 100% reduce 1%
13/11/19 14:23:39 INFO mapred.JobClient: map 100% reduce 2%
13/11/19 14:25:37 INFO mapred.JobClient: map 100% reduce 3%
13/11/19 14:31:12 INFO mapred.JobClient: map 100% reduce 4%
13/11/19 14:34:26 INFO mapred.JobClient: map 100% reduce 5%
13/11/19 14:35:58 INFO mapred.JobClient: map 89% reduce 5%
13/11/19 14:46:54 INFO mapred.JobClient: map 79% reduce 5%
13/11/19 14:46:55 INFO mapred.JobClient: map 79% reduce 6%
13/11/19 14:53:09 INFO mapred.JobClient: map 79% reduce 7%
13/11/19 14:56:08 INFO mapred.JobClient: map 79% reduce 8%
13/11/19 14:56:50 INFO mapred.JobClient: Task Id : attempt_201310311057_0040_m_000006_0, Status : FAILED
Task attempt_201310311057_0040_m_000006_0 failed to report status for 1225 seconds. Killing!
Task attempt_201310311057_0040_m_000006_0 failed to report status for 1249 seconds. Killing!
13/11/19 14:57:59 WARN mapred.JobClient: Error reading task outputRead timed out
13/11/19 14:59:00 WARN mapred.JobClient: Error reading task outputRead timed out
13/11/19 14:59:01 INFO mapred.JobClient: map 70% reduce 8%
13/11/19 14:59:20 INFO mapred.JobClient: map 71% reduce 8%
13/11/19 15:00:50 INFO mapred.JobClient: map 71% reduce 9%
13/11/19 15:01:41 INFO mapred.JobClient: map 71% reduce 10%
13/11/19 15:01:54 INFO mapred.JobClient: map 72% reduce 10%
13/11/19 15:02:25 INFO mapred.JobClient: map 73% reduce 10%
13/11/19 15:02:34 INFO mapred.JobClient: Task Id : attempt_201310311057_0040_m_000005_0, Status : FAILED
Task attempt_201310311057_0040_m_000005_0 failed to report status for 1212 seconds. Killing!
13/11/19 15:03:16 INFO mapred.JobClient: map 74% reduce 10%
13/11/19 15:04:08 INFO mapred.JobClient: map 75% reduce 10%
13/11/19 15:04:48 INFO mapred.JobClient: map 76% reduce 10%
13/11/19 15:06:19 INFO mapred.JobClient: map 77% reduce 10%
13/11/19 15:07:35 INFO mapred.JobClient: map 77% reduce 11%
13/11/19 15:07:46 INFO mapred.JobClient: map 78% reduce 11%
13/11/19 15:09:46 INFO mapred.JobClient: map 79% reduce 11%
13/11/19 15:10:11 INFO mapred.JobClient: map 79% reduce 12%
13/11/19 15:12:00 INFO mapred.JobClient: map 80% reduce 12%
13/11/19 15:12:56 INFO mapred.JobClient: map 81% reduce 12%
13/11/19 15:13:46 INFO mapred.JobClient: map 82% reduce 12%
13/11/19 15:14:37 INFO mapred.JobClient: map 83% reduce 12%
13/11/19 15:15:36 INFO mapred.JobClient: map 84% reduce 12%
13/11/19 15:16:41 INFO mapred.JobClient: map 85% reduce 12%
13/11/19 15:17:44 INFO mapred.JobClient: map 86% reduce 12%
13/11/19 15:18:45 INFO mapred.JobClient: map 87% reduce 12%
13/11/19 15:20:22 INFO mapred.JobClient: map 88% reduce 12%
13/11/19 15:22:41 INFO mapred.JobClient: map 89% reduce 12%
13/11/19 15:23:57 INFO mapred.JobClient: Task Id : attempt_201310311057_0040_m_000004_0, Status : FAILED
Task attempt_201310311057_0040_m_000004_0 failed to report status for 1378 seconds. Killing!
Task attempt_201310311057_0040_m_000004_0 failed to report status for 1292 seconds. Killing!
13/11/19 15:24:00 INFO mapred.JobClient: map 89% reduce 13%
13/11/19 15:25:08 INFO mapred.JobClient: map 79% reduce 13%
13/11/19 15:26:44 INFO mapred.JobClient: map 69% reduce 13%
13/11/19 15:28:15 INFO mapred.JobClient: map 70% reduce 13%
13/11/19 15:28:40 INFO mapred.JobClient: map 71% reduce 13%
13/11/19 15:29:06 INFO mapred.JobClient: map 71% reduce 12%
13/11/19 15:29:31 INFO mapred.JobClient: map 72% reduce 12%
13/11/19 15:30:13 INFO mapred.JobClient: map 73% reduce 12%
13/11/19 15:30:36 INFO mapred.JobClient: Task Id : attempt_201310311057_0040_m_000003_0, Status : FAILED
Task attempt_201310311057_0040_m_000003_0 failed to report status for 1203 seconds. Killing!
13/11/19 15:30:36 INFO mapred.JobClient: Task Id : attempt_201310311057_0040_m_000002_0, Status : FAILED
Task attempt_201310311057_0040_m_000002_0 failed to report status for 1200 seconds. Killing!
13/11/19 15:30:36 INFO mapred.JobClient: Task Id : attempt_201310311057_0040_r_000006_0, Status : FAILED
Task attempt_201310311057_0040_r_000006_0 failed to report status for 1202 seconds. Killing!
13/11/19 15:31:14 INFO mapred.JobClient: map 74% reduce 12%
13/11/19 15:31:39 INFO mapred.JobClient: map 75% reduce 12%
13/11/19 15:32:29 INFO mapred.JobClient: map 76% reduce 12%
13/11/19 15:33:43 INFO mapred.JobClient: map 77% reduce 12%
13/11/19 15:34:24 INFO mapred.JobClient: map 77% reduce 13%
13/11/19 15:34:42 INFO mapred.JobClient: map 78% reduce 13%
13/11/19 15:35:02 INFO mapred.JobClient: map 78% reduce 14%
13/11/19 15:35:34 INFO mapred.JobClient: map 79% reduce 14%
13/11/19 15:36:29 INFO mapred.JobClient: map 80% reduce 14%
13/11/19 15:36:51 INFO mapred.JobClient: map 80% reduce 15%
13/11/19 15:37:12 INFO mapred.JobClient: map 81% reduce 15%
13/11/19 15:37:46 INFO mapred.JobClient: map 82% reduce 15%
13/11/19 15:38:12 INFO mapred.JobClient: map 83% reduce 15%
13/11/19 15:38:39 INFO mapred.JobClient: map 84% reduce 15%
13/11/19 15:39:18 INFO mapred.JobClient: map 85% reduce 15%
13/11/19 15:39:50 INFO mapred.JobClient: map 86% reduce 15%
13/11/19 15:40:16 INFO mapred.JobClient: map 87% reduce 15%
13/11/19 15:40:52 INFO mapred.JobClient: map 88% reduce 15%
13/11/19 15:41:18 INFO mapred.JobClient: map 89% reduce 15%
13/11/19 15:41:48 INFO mapred.JobClient: map 90% reduce 15%
13/11/19 15:42:47 INFO mapred.JobClient: map 91% reduce 15%
13/11/19 15:43:58 INFO mapred.JobClient: map 92% reduce 15%
13/11/19 15:45:36 INFO mapred.JobClient: map 93% reduce 15%
13/11/19 15:46:29 INFO mapred.JobClient: map 93% reduce 16%
13/11/19 15:46:53 INFO mapred.JobClient: map 94% reduce 16%
13/11/19 15:48:25 INFO mapred.JobClient: map 94% reduce 17%
13/11/19 15:48:56 INFO mapred.JobClient: map 95% reduce 17%
13/11/19 15:50:37 INFO mapred.JobClient: map 96% reduce 17%
13/11/19 15:51:46 INFO mapred.JobClient: map 96% reduce 18%
13/11/19 15:52:15 INFO mapred.JobClient: map 97% reduce 18%
13/11/19 15:53:08 INFO mapred.JobClient: map 97% reduce 19%
13/11/19 15:56:03 INFO mapred.JobClient: map 97% reduce 20%
13/11/19 15:56:54 INFO mapred.JobClient: map 98% reduce 20%
13/11/19 15:57:10 INFO mapred.JobClient: map 98% reduce 21%
13/11/19 15:59:26 INFO mapred.JobClient: map 99% reduce 21%
13/11/19 16:02:58 INFO mapred.JobClient: map 100% reduce 21%
13/11/19 16:03:57 INFO mapred.JobClient: map 100% reduce 22%
13/11/19 16:30:35 INFO mapred.JobClient: map 100% reduce 23%
13/11/19 16:35:00 INFO mapred.JobClient: map 100% reduce 24%
13/11/19 16:40:35 INFO mapred.JobClient: map 100% reduce 25%
13/11/19 16:40:38 INFO mapred.JobClient: map 100% reduce 26%
13/11/19 16:44:38 INFO mapred.JobClient: map 100% reduce 27%
13/11/19 16:49:08 INFO mapred.JobClient: map 100% reduce 28%
13/11/19 16:49:30 INFO mapred.JobClient: map 100% reduce 29%
13/11/19 16:52:25 INFO mapred.JobClient: map 100% reduce 33%
13/11/19 16:53:54 INFO mapred.JobClient: map 100% reduce 38%
13/11/19 16:54:10 INFO mapred.JobClient: map 100% reduce 42%
13/11/19 16:55:21 INFO mapred.JobClient: map 100% reduce 43%
13/11/19 16:55:36 INFO mapred.JobClient: map 100% reduce 47%
13/11/19 16:55:39 INFO mapred.JobClient: map 100% reduce 56%
13/11/19 16:56:40 INFO mapred.JobClient: map 100% reduce 57%
13/11/19 16:58:04 INFO mapred.JobClient: map 100% reduce 58%
13/11/19 17:01:25 INFO mapred.JobClient: map 100% reduce 59%
13/11/19 17:04:47 INFO mapred.JobClient: map 100% reduce 64%
13/11/19 17:05:01 INFO mapred.JobClient: map 100% reduce 69%
13/11/19 17:07:39 INFO mapred.JobClient: map 100% reduce 70%
13/11/19 17:10:32 INFO mapred.JobClient: map 100% reduce 71%
13/11/19 17:13:21 INFO mapred.JobClient: map 100% reduce 72%
13/11/19 17:16:08 INFO mapred.JobClient: map 100% reduce 73%
13/11/19 17:19:03 INFO mapred.JobClient: map 100% reduce 74%
13/11/19 17:21:55 INFO mapred.JobClient: map 100% reduce 75%
13/11/19 17:24:46 INFO mapred.JobClient: map 100% reduce 76%
13/11/19 17:27:35 INFO mapred.JobClient: map 100% reduce 77%
13/11/19 17:30:24 INFO mapred.JobClient: map 100% reduce 78%
13/11/19 17:33:14 INFO mapred.JobClient: map 100% reduce 79%
13/11/19 17:36:07 INFO mapred.JobClient: map 100% reduce 80%
13/11/19 17:39:00 INFO mapred.JobClient: map 100% reduce 81%
13/11/19 17:41:51 INFO mapred.JobClient: map 100% reduce 82%
13/11/19 17:44:39 INFO mapred.JobClient: map 100% reduce 83%
13/11/19 17:47:27 INFO mapred.JobClient: map 100% reduce 84%
13/11/19 17:50:22 INFO mapred.JobClient: map 100% reduce 85%
13/11/19 17:53:09 INFO mapred.JobClient: map 100% reduce 86%
13/11/19 17:55:54 INFO mapred.JobClient: map 100% reduce 87%
13/11/19 17:58:44 INFO mapred.JobClient: map 100% reduce 88%
13/11/19 18:01:35 INFO mapred.JobClient: map 100% reduce 89%
13/11/19 18:04:21 INFO mapred.JobClient: map 100% reduce 90%
13/11/19 18:07:16 INFO mapred.JobClient: map 100% reduce 91%
13/11/19 18:10:08 INFO mapred.JobClient: map 100% reduce 92%
13/11/19 18:12:55 INFO mapred.JobClient: map 100% reduce 93%
13/11/19 18:15:51 INFO mapred.JobClient: map 100% reduce 94%
13/11/19 18:18:45 INFO mapred.JobClient: map 100% reduce 95%
13/11/19 18:21:36 INFO mapred.JobClient: map 100% reduce 96%
13/11/19 18:24:25 INFO mapred.JobClient: map 100% reduce 97%
13/11/19 18:27:42 INFO mapred.JobClient: map 100% reduce 98%
13/11/19 18:31:25 INFO mapred.JobClient: map 100% reduce 99%
13/11/19 18:41:13 INFO mapred.JobClient: map 100% reduce 100%
During this period (from map 100% reduce 0% to map 100% reduce 5%) I observed that only 5 map tasks (out of 10) had completed; the other 5 then failed due to timeouts, after which they were re-run. I know this could be worked around by increasing the timeout; that is not my question.
I know that between map and reduce the data gets spilled, shuffled, and sorted. First question: is it normal to wait this long between the map phase and the reduce phase for data of this size? It doesn't feel right.
My reducer was somewhat heavy, so I swapped it for an identity reducer, but that didn't seem to help much. This makes me think the problem is in my mapper, or in the shuffle/sort. So here is my mapper.
public static class CliquesMapper extends
        Mapper<YearTermKey, SetWritable, YearTermKey, MapWritable> {

    private YearTermKey outputKEY = new YearTermKey();

    public void map(YearTermKey key, SetWritable value, Context context)
            throws IOException, InterruptedException {
        Set<Writable> neighbors = value.keySet();
        int listSize = neighbors.size();
        if (listSize != 1) {
            for (Writable keyTerm : neighbors) {
                IntWritable KEYTerm = (IntWritable) keyTerm;
                outputKEY.set(new Text(key.getYear()), KEYTerm);
                MapWritable outputVALUE = new MapWritable();
                outputVALUE.put(key.getTerm(), value);
                context.write(outputKEY, outputVALUE);
            }
        } else {
            IntWritable finalTerm = new IntWritable();
            for (Writable t : neighbors) {
                finalTerm.set(((IntWritable) t).get());
            }
            outputKEY.set(key.getYear(), finalTerm);
            NullWritable nw = NullWritable.get();
            MapWritable outputVALUE = new MapWritable();
            outputVALUE.put(key.getTerm(), nw);
            context.write(outputKEY, outputVALUE);
        }
    }
}
Second question: could the key-value pairs that my mapper emits be causing this delay? If not, why is this happening?
In any case, after all 10 map tasks finally complete (at map 100%, reduce around 33%), the reducers take almost 2 hours to finish. How is that possible if it is an identity reducer?
Answer (score: 5)
You asked several questions; they are related, but they have different answers. I address them one by one below.
Is it normal to wait this long between the map phase and the reduce phase with this data size?
There is a barrier between the map and reduce phases: your reducers cannot start until all of your mappers have finished. You have some failing mappers, which slows down the entire map phase and holds back the reduce phase. Once you fix that, your reduce phase should start earlier.
Why are your map tasks failing? Apparently they are not reporting progress:
[...] failed to report status for 1225 seconds. Killing!
Could the key-value pairs emitted from my mapper be causing this delay? If not, why is this happening?
I am not sure, but I did look at your code, and you could make it run faster as follows:
1) Convert your Text to an IntWritable; it looks like it is numeric data (a year), and this will reduce the amount of data sent from the mappers to the reducers. See tip 5 in this page on tips to improve Hadoop performance.
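To make tip 1 concrete, here is a rough sketch of a composite key that stores the year as a primitive int rather than a Text. In a real job the class would implement org.apache.hadoop.io.WritableComparable<YearTermKey> and would be registered as the map output key class; to keep this sketch self-contained it uses only the stdlib java.io interfaces, which define the same write/readFields wire format. The field and class names are illustrative, not the question's actual YearTermKey.

```java
import java.io.*;

// Hypothetical sketch of a YearTermKey-like composite key with the year
// held as an int instead of Text. A real Hadoop key would implement
// WritableComparable; the serialization shape shown here is the same.
class YearTermKeySketch implements Comparable<YearTermKeySketch> {
    int year;   // e.g. 2013 -- a fixed 4 bytes instead of the Text bytes "2013"
    int term;

    void set(int year, int term) {
        this.year = year;
        this.term = term;
    }

    // Serialize both fields; 8 bytes total on the wire.
    void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeInt(term);
    }

    // Deserialize in the same field order as write().
    void readFields(DataInput in) throws IOException {
        year = in.readInt();
        term = in.readInt();
    }

    // Sort by year first, then by term, as the shuffle would.
    @Override
    public int compareTo(YearTermKeySketch o) {
        int c = Integer.compare(year, o.year);
        return c != 0 ? c : Integer.compare(term, o.term);
    }
}
```

With many millions of intermediate records, shaving the key down to two fixed-width ints adds up across the spill, shuffle, and merge steps.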
2) Reuse your Writables. You create a new Text on every iteration. You would be surprised how much this matters: the continuous creation and deallocation of objects on the heap can cause terrible performance. The idea is to create the Writable once and then reuse it. For details, see tip 6 in the same page on tips to improve Hadoop performance.
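The reuse pattern can be sketched without Hadoop at all. Below, MutableText is a hypothetical stand-in for org.apache.hadoop.io.Text: the loop re-fills one preallocated object instead of allocating a fresh one per record. In the question's mapper the analogous change would be turning `new Text(key.getYear())` into a `set(...)` call on a Text field allocated once; this works because Hadoop serializes the Writable's current contents at context.write() time. All class names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.hadoop.io.Text: a small mutable holder that
// is allocated once and re-filled for each record, instead of being
// re-created inside the map loop.
class MutableText {
    private String value = "";
    void set(String v) { value = v; }
    String get() { return value; }
}

class ReuseDemo {
    // Allocated once, like a mapper's Writable field.
    private final MutableText buffer = new MutableText();

    // Simulates emitting one output per input record while reusing
    // `buffer`. The value is copied out at emit time, mirroring how
    // Hadoop serializes the Writable's state on context.write().
    List<String> process(List<String> records) {
        List<String> emitted = new ArrayList<>();
        for (String r : records) {
            buffer.set(r.toUpperCase()); // mutate, don't allocate
            emitted.add(buffer.get());   // "serialize" the current state
        }
        return emitted;
    }
}
```

With one allocation per emitted pair removed, the young-generation garbage collector has far less churn to clean up during a long map task.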
Although I cannot be certain, I suspect this may be why some of your mappers are failing: garbage collection can pause the program while it runs, so no progress gets reported and Hadoop kills the task.
3) If you are not doing so already, use multiple reducers. See tip 3 in the page I linked above for heuristics on setting an appropriate number of map and reduce tasks for your job.
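The reducer count is a driver-side configuration fragment in the Hadoop 1.x mapreduce API; the job name and the count of 8 below are illustrative, not from the question. A common heuristic is roughly 0.95 or 1.75 times (nodes × mapred.tasktracker.reduce.tasks.maximum).

```java
// Driver-side sketch: request multiple reduce tasks instead of the
// default of 1. The name "cliques-step2" and the count 8 are made up
// for illustration -- tune the count to your cluster's reduce slots.
Configuration conf = new Configuration();
Job job = new Job(conf, "cliques-step2");
job.setNumReduceTasks(8);
```

The same setting can be passed on the command line via -D mapred.reduce.tasks=8 when the driver uses ToolRunner.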
After all 10 map tasks complete (at map 100%, reduce around 33%), the reducers take almost 2 hours to finish. How is that possible if it is an identity reducer?
If you have too much data going to one (or a few) reducers, this can be normal behavior. Shuffling implies sorting on the reduce side. Try using sort on a large file on a Linux box; it can take a very long time. That is what is happening in your job.