Hadoop scheduling | 2 extremes | Availability and scarcity of resources

Time: 2018-11-10 15:28:37

Tags: hadoop mapreduce hadoop2

I have the following 6 DataNodes (dn):

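  • dn1: 6 cores, 6 GB, 3 map slots and 3 reduce slots
  • dn2: 6 cores, 6 GB, 3 map slots and 3 reduce slots
  • dn3: 6 cores, 6 GB, 3 map slots and 3 reduce slots
  • dn4: 6 cores, 6 GB, 3 map slots and 3 reduce slots
  • dn5: 6 cores, 6 GB, 3 map slots and 3 reduce slots
  • dn6: 6 cores, 6 GB, 3 map slots and 3 reduce slots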

Case 1 (availability)

State of the system
===================
dn1 has 1 mapper running from another job Y; so 2 mapper slots are free
dn2 has 1 mapper running from another job Y; so 2 mapper slots are free
dn3 has 1 mapper running from another job Y; so 2 mapper slots are free
dn4 has 1 mapper running from another job Y; so 2 mapper slots are free
dn5 has 0 mappers running; so 3 mapper slots are free
dn6 has 0 mappers running; so 3 mapper slots are free
State of my input file
======================
I have a file that is stored as three 64 MB blocks with a replication factor (RF) of 3, distributed in the following way:
R1(dn1,dn2,dn3) 
R2(dn2,dn3,dn4) 
R3(dn3,dn4,dn5)

When I run job X on this file, 3 mappers need to be created, one for each of the 3 data blocks.
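To make the Case 1 setup concrete, here is a minimal Python sketch that tabulates, for each block, which replica-holding DataNodes still have a free map slot. The slot counts and replica placement are copied from the state above; the function name `data_local_candidates` is made up for illustration. It does not model how FIFO, Capacity, or Fair Share scheduling actually assigns tasks; it only restates the scenario.

```python
# Free map slots per DataNode in Case 1 (copied from the state listed above).
free_map_slots = {"dn1": 2, "dn2": 2, "dn3": 2, "dn4": 2, "dn5": 3, "dn6": 3}

# Replica placement of the three input blocks (RF 3).
block_replicas = {
    "R1": ["dn1", "dn2", "dn3"],
    "R2": ["dn2", "dn3", "dn4"],
    "R3": ["dn3", "dn4", "dn5"],
}


def data_local_candidates(replicas, free_slots):
    """Return, per block, the replica-holding nodes that have a free map slot.

    Hypothetical helper for illustration only; not part of any Hadoop API.
    """
    return {
        block: [node for node in nodes if free_slots.get(node, 0) > 0]
        for block, nodes in replicas.items()
    }


case1_candidates = data_local_candidates(block_replicas, free_map_slots)
for block, nodes in case1_candidates.items():
    print(block, "->", nodes)
```

In this state every block has three data-local candidates, so the open question in Case 1 is purely about job ordering, not about data locality.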

Questions

With the FIFO scheduler: is job X still put on the queue to wait for job Y to finish, simply because the scheduler is FIFO and another job is already running, even though free mapper slots exist on the same machines? Or does the FIFO logic only kick in when no more resources are available in the system, so that jobs have to be queued?
With the Capacity Scheduler: what would the behaviour be?
With the Fair Share Scheduler: what would the behaviour be?

Case 2 (scarcity)

State of the system
===================
dn1 has 3 mappers running from another job Y; so 0 mapper slots are free
dn2 has 3 mappers running from another job Y; so 0 mapper slots are free
dn3 has 3 mappers running from another job Y; so 0 mapper slots are free
dn4 has 3 mappers running from another job Y; so 0 mapper slots are free
dn5 has 3 mappers running from another job Y; so 0 mapper slots are free
dn6 has 0 mappers running; so 3 mapper slots are free

I have a file that is stored as three 64 MB blocks with a replication factor (RF) of 3, distributed in the following way:

R1(dn1,dn2,dn3) 
R2(dn2,dn3,dn4) 
R3(dn3,dn4,dn5)

When I run job X on this file, 3 mappers need to be created, one for each of the 3 data blocks.
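The Case 2 setup can be tabulated in the same way. The minimal Python sketch below uses the slot counts and replica placement stated above; the helper name `data_local_candidates` is hypothetical. It only shows which nodes could host a mapper and whether they hold a replica, and makes no claim about what any of the three schedulers would actually do.

```python
# Free map slots per DataNode in Case 2 (copied from the state listed above).
free_map_slots = {"dn1": 0, "dn2": 0, "dn3": 0, "dn4": 0, "dn5": 0, "dn6": 3}

# Replica placement of the three input blocks (RF 3).
block_replicas = {
    "R1": ["dn1", "dn2", "dn3"],
    "R2": ["dn2", "dn3", "dn4"],
    "R3": ["dn3", "dn4", "dn5"],
}


def data_local_candidates(replicas, free_slots):
    """Return, per block, the replica-holding nodes that have a free map slot.

    Hypothetical helper for illustration only; not part of any Hadoop API.
    """
    return {
        block: [node for node in nodes if free_slots.get(node, 0) > 0]
        for block, nodes in replicas.items()
    }


case2_candidates = data_local_candidates(block_replicas, free_map_slots)

# Nodes that have any free map slot at all, regardless of where the data lives.
nodes_with_free_slots = sorted(n for n, free in free_map_slots.items() if free > 0)

# Nodes that hold at least one replica of the input file.
nodes_with_replicas = sorted({n for nodes in block_replicas.values() for n in nodes})

print("data-local candidates:", case2_candidates)
print("nodes with free slots:", nodes_with_free_slots)
print("dn6 holds a replica:", "dn6" in nodes_with_replicas)
```

Here no block has a data-local candidate, and the only node with free slots, dn6, holds none of the replicas, which is exactly the situation the questions below ask about.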

Questions

What happens now:

    - Are the 3 mapper tasks created on dn6 (which does not yet hold any of the input file's data blocks), with the corresponding data blocks transferred over the network from, say, dn1 to dn6?
        - If yes, is this behaviour the same for all three schedulers (FIFO, Capacity, Fair Share)?
        - If no, can you elaborate on the behaviour in this case for each of:
            - FIFO Scheduler
            - Capacity Scheduler
            - Fair Share Scheduler

0 answers:

No answers