Detailed information about the Hadoop shuffle phase

Asked: 2018-06-27 21:24:38

Tags: hadoop mapreduce hadoop2

I am simulating a Hadoop cluster with 8 nodes and using TeraSort to run a MapReduce job with only one reducer.

  

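"The map output file is sitting on the local disk of the machine that ran the map task (note that although map outputs always get written to local disk, reduce outputs may not be), but now it is needed by the machine that is about to run the reduce task for the partition."

Source: "Hadoop: The Definitive Guide", Chapter 7: The Reduce Side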

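I have a question. Is it possible to see the following in the logs for Hadoop's copy phase (also called shuffle)?

  • the individual flows transferred from the mappers to the reducer
  • the start and end time of each flow (shuffle phase only)
  • the amount of data transferred from each mapper to the reducer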

Shuffle and sort in MapReduce

In the hadoop-user-datanode-slaveX.log file I found only this information:

....    
    2018-06-27 15:09:32,714 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741859_1035, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:32,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741860_1036 src: /10.0.0.8:41648 dest: /10.0.0.8:50010
    2018-06-27 15:09:34,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41648, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741860_1036, duration(ns): 1780668464
    2018-06-27 15:09:34,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741860_1036, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:34,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741861_1037 src: /10.0.0.8:41650 dest: /10.0.0.8:50010
    2018-06-27 15:09:36,198 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41650, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741861_1037, duration(ns): 1688832489
    2018-06-27 15:09:36,198 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741861_1037, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:36,211 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741862_1038 src: /10.0.0.8:41652 dest: /10.0.0.8:50010
    2018-06-27 15:09:37,968 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41652, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741862_1038, duration(ns): 1755027949
    2018-06-27 15:09:37,968 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741862_1038, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:37,973 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741863_1039 src: /10.0.0.8:41654 dest: /10.0.0.8:50010
    2018-06-27 15:09:39,695 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41654, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741863_1039, duration(ns): 1719843826
    2018-06-27 15:09:39,695 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741863_1039, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:39,710 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741864_1040 src: /10.0.0.8:41656 dest: /10.0.0.8:50010
    2018-06-27 15:09:41,378 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41656, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741864_1040, duration(ns): 1664855767
    2018-06-27 15:09:41,378 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741864_1040, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:41,390 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741865_1041 src: /10.0.0.8:41658 dest: /10.0.0.8:50010
    2018-06-27 15:09:43,055 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41658, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741865_1041, duration(ns): 1662899723
    2018-06-27 15:09:43,055 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741865_1041, type=LAST_IN_PIPELINE terminating
    2018-06-27 15:09:43,060 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1726097632-172.17.0.2-1530090285799:blk_1073741866_1042 src: /10.0.0.8:41660 dest: /10.0.0.8:50010
    2018-06-27 15:09:44,778 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.0.0.8:41660, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741866_1042, duration(ns): 1717463627
    2018-06-27 15:09:44,779 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1726097632-172.17.0.2-1530090285799:blk_1073741866_1042, type=LAST_IN_PIPELINE terminating
....

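Here src and dest are both the IP address of slaveX (the reducer), so it looks as if the node is writing to itself. Judging by their size, however, these look like the flows coming from the mappers.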
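
To make the fields of these DataNode lines easier to compare, here is a minimal Python sketch that pulls src, dest, bytes, op and duration out of a `clienttrace` line. The regular expression is written only against the line layout shown in the excerpt above, and the names `parse_clienttrace`, `CLIENTTRACE` and `TASK` are hypothetical helpers, not part of Hadoop. In the excerpt, every record has `op: HDFS_WRITE`, a cliID that embeds a reduce attempt ID (`_r_000000_0`), and a size of 134217728 bytes (exactly 128 MiB), so these lines may describe the reducer writing blocks to HDFS and not shuffle traffic; that reading is an interpretation, the log does not state it.

```python
import re

# Layout assumed from the DataNode excerpt above; other Hadoop versions may differ.
CLIENTTRACE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO \S+\.DataNode\.clienttrace: "
    r"src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: (?P<op>\w+), cliID: (?P<cli>[^,]+), "
    r".*?duration\(ns\): (?P<dur>\d+)"
)
# A task attempt ID embedded in the client ID: 'm' marks a map task, 'r' a reduce task.
TASK = re.compile(r"attempt_\d+_\d+_(?P<kind>[mr])_\d+_\d+")


def parse_clienttrace(lines):
    """Return one dict per clienttrace line; all other lines are skipped."""
    records = []
    for line in lines:
        m = CLIENTTRACE.search(line)
        if not m:
            continue
        task = TASK.search(m.group("cli"))
        records.append({
            "timestamp": m.group("ts"),
            "src": m.group("src"),
            "dest": m.group("dest"),
            "bytes": int(m.group("bytes")),
            "op": m.group("op"),
            "task_type": task.group("kind") if task else None,
            "duration_ns": int(m.group("dur")),
        })
    return records


# One line copied from the excerpt above.
SAMPLE = (
    "    2018-06-27 15:09:34,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.0.0.8:41648, dest: /10.0.0.8:50010, bytes: 134217728, op: HDFS_WRITE, "
    "cliID: DFSClient_attempt_1530111954627_0002_r_000000_0_260756741_1, offset: 0, "
    "srvID: 27c685b7-802f-4c80-997e-b4b1c7dfc986, "
    "blockid: BP-1726097632-172.17.0.2-1530090285799:blk_1073741860_1036, duration(ns): 1780668464"
)

records = parse_clienttrace([SAMPLE])
for r in records:
    print(r["op"], r["task_type"], r["src"], "->", r["dest"], r["bytes"] / 2**20, "MiB")
```

Feeding the whole log file (for example `parse_clienttrace(open(path))`) and grouping by `op` and `task_type` gives a quick view of which task attempts wrote or read which blocks.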

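I am also trying to find the information in the syslog.shuffle file, but it is not clear to me whether the individual data flows are tracked there: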

2018-06-28 22:23:35,136 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=1068708672, maxSingleShuffleLimit=267177168, mergeThreshold=705347776, ioSortFactor=10, memToMemMergeOutputsThreshold=10
2018-06-28 22:23:35,260 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1530224442376_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
2018-06-28 22:23:35,338 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1530224442376_0002_r_000000_0: Got 11 new map-outputs
2018-06-28 22:23:36,100 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#5 about to shuffle output of map attempt_1530224442376_0002_m_000023_0 decomp: 139586410 len: 139586414 to MEMORY
2018-06-28 22:23:36,720 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#2 about to shuffle output of map attempt_1530224442376_0002_m_000012_0 decomp: 139586410 len: 139586414 to MEMORY
2018-06-28 22:23:36,732 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1530224442376_0002_r_000000_0: Got 3 new map-outputs
2018-06-28 22:23:36,736 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1530224442376_0002_m_000028_0 decomp: 125789874 len: 125789878 to MEMORY
2018-06-28 22:23:37,382 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 125789874 bytes from map-output for attempt_1530224442376_0002_m_000028_0
2018-06-28 22:23:37,553 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#4 about to shuffle output of map attempt_1530224442376_0002_m_000001_0 decomp: 139586410 len: 139586414 to MEMORY
2018-06-28 22:23:37,553 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 125789874, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->544549104
2018-06-28 22:23:37,573 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave4:13562 freed by fetcher#1 in 2221ms
2018-06-28 22:23:37,948 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1530224442376_0002_m_000017_0 decomp: 139586514 len: 139586518 to MEMORY
2018-06-28 22:23:38,914 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1530224442376_0002_r_000000_0: Got 3 new map-outputs
2018-06-28 22:23:39,110 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586514 bytes from map-output for attempt_1530224442376_0002_m_000017_0
2018-06-28 22:23:39,110 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 139586514, inMemoryMapOutputs.size() -> 2, commitMemory -> 125789874, usedMemory ->684135618
2018-06-28 22:23:39,110 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave4:13562 freed by fetcher#1 in 1537ms
2018-06-28 22:23:39,519 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1530224442376_0002_m_000019_0 decomp: 139586410 len: 139586414 to MEMORY
2018-06-28 22:23:39,596 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586410 bytes from map-output for attempt_1530224442376_0002_m_000001_0
2018-06-28 22:23:39,597 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 139586410, inMemoryMapOutputs.size() -> 3, commitMemory -> 265376388, usedMemory ->823722028
2018-06-28 22:23:39,823 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586410 bytes from map-output for attempt_1530224442376_0002_m_000019_0
2018-06-28 22:23:39,911 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#4 about to shuffle output of map attempt_1530224442376_0002_m_000022_0 decomp: 139586410 len: 139586414 to MEMORY
2018-06-28 22:23:39,912 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 139586410, inMemoryMapOutputs.size() -> 4, commitMemory -> 404962798, usedMemory ->963308438
2018-06-28 22:23:40,075 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1530224442376_0002_m_000003_0 decomp: 139586514 len: 139586518 to MEMORY
2018-06-28 22:23:40,085 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1530224442376_0002_r_000000_0: Got 1 new map-outputs
2018-06-28 22:23:40,333 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586514 bytes from map-output for attempt_1530224442376_0002_m_000003_0
2018-06-28 22:23:40,333 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 139586514, inMemoryMapOutputs.size() -> 5, commitMemory -> 544549208, usedMemory ->1102894952
2018-06-28 22:23:40,334 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave4:13562 freed by fetcher#1 in 1224ms
2018-06-28 22:23:40,696 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586410 bytes from map-output for attempt_1530224442376_0002_m_000023_0
2018-06-28 22:23:40,696 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 139586410, inMemoryMapOutputs.size() -> 6, commitMemory -> 684135722, usedMemory ->1102894952
2018-06-28 22:23:40,696 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Starting inMemoryMerger's merge since commitMemory=823722132 > mergeThreshold=705347776. Current usedMemory=1102894952
2018-06-28 22:23:40,697 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.MergeThread: InMemoryMerger - Thread to merge in-memory shuffled map-outputs: Starting merge with 6 segments, while ignoring 0 segments
2018-06-28 22:23:40,707 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave1:13562 freed by fetcher#5 in 5377ms
2018-06-28 22:23:40,732 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 - MergeManager returned status WAIT ...
2018-06-28 22:23:40,733 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave1:13562 freed by fetcher#1 in 25ms
2018-06-28 22:23:40,748 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#3 - MergeManager returned status WAIT ...
2018-06-28 22:23:40,792 INFO [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave1:13562 freed by fetcher#3 in 57ms
2018-06-28 22:23:40,992 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586410 bytes from map-output for attempt_1530224442376_0002_m_000022_0
2018-06-28 22:23:41,088 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 139586410, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->1102894952
2018-06-28 22:23:41,088 INFO [fetcher#4] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: slave7:13562 freed by fetcher#4 in 4352ms
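
As a starting point for the three items asked about above, here is a rough Python sketch that pairs each `Fetcher: ... about to shuffle output of map <attempt>` line with the matching `InMemoryMapOutput: Read N bytes from map-output for <attempt>` line from this excerpt. It is only an approximation under stated assumptions: the patterns follow the line layout shown here, only outputs shuffled to MEMORY are handled, the "about to shuffle" timestamp is taken as the start of the flow, and `shuffle_flows` is a hypothetical helper name. The host that served each map output does not appear on these two lines; in the excerpt, hosts show up only on the `ShuffleSchedulerImpl` "freed by" lines (for example `slave4:13562`).

```python
import re
from datetime import datetime

TS = r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
# Layouts assumed from the syslog.shuffle excerpt above.
START = re.compile(
    TS + r" INFO \[(?P<fetcher>fetcher#\d+)\] \S+\.Fetcher: \S+ about to shuffle "
    r"output of map (?P<attempt>attempt_\w+) decomp: (?P<decomp>\d+) "
    r"len: (?P<len>\d+) to (?P<target>\w+)"
)
END = re.compile(
    TS + r" INFO \[fetcher#\d+\] \S+\.InMemoryMapOutput: "
    r"Read (?P<bytes>\d+) bytes from map-output for (?P<attempt>attempt_\w+)"
)


def parse_ts(text):
    # %f accepts the 3-digit millisecond field used in these logs.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def shuffle_flows(lines):
    """Map each map attempt ID to its fetcher, start/end time and byte counts."""
    flows = {}
    for line in lines:
        m = START.search(line)
        if m:
            flows[m.group("attempt")] = {
                "fetcher": m.group("fetcher"),
                "start": parse_ts(m.group("ts")),
                "len": int(m.group("len")),
                "end": None,
                "bytes_read": None,
            }
            continue
        m = END.search(line)
        if m and m.group("attempt") in flows:
            flow = flows[m.group("attempt")]
            flow["end"] = parse_ts(m.group("ts"))
            flow["bytes_read"] = int(m.group("bytes"))
    return flows


# Four lines copied from the excerpt above.
SAMPLE = """\
2018-06-28 22:23:36,100 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#5 about to shuffle output of map attempt_1530224442376_0002_m_000023_0 decomp: 139586410 len: 139586414 to MEMORY
2018-06-28 22:23:36,736 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1530224442376_0002_m_000028_0 decomp: 125789874 len: 125789878 to MEMORY
2018-06-28 22:23:37,382 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 125789874 bytes from map-output for attempt_1530224442376_0002_m_000028_0
2018-06-28 22:23:40,696 INFO [fetcher#5] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 139586410 bytes from map-output for attempt_1530224442376_0002_m_000023_0
"""

flows = shuffle_flows(SAMPLE.splitlines())
for attempt, flow in flows.items():
    if flow["end"] is not None:
        seconds = (flow["end"] - flow["start"]).total_seconds()
        print(attempt, flow["fetcher"], flow["bytes_read"], "bytes in", seconds, "s")
```

Attempts whose `end` stays `None` were still in flight at the end of the parsed text, or were written somewhere other than memory, which this sketch does not cover.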

Thanks

0 answers:

No answers yet