How to interpret MapReduce performance counters

Time: 2015-06-29 11:31:22

Tags: hadoop mapreduce

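More specifically:

  1. In the task counters, the CPU time spent comes from utime + stime in proc/stat, so it means things like IOWait are not counted. Is that right?
  2. If the elapsed time of the whole task is much longer than the CPU time reported by the counter, does that mean the node was very busy and the container was either starved of CPU or waiting a long time on I/O?
  3. How can I tell from the counters whether a task is CPU-bound or I/O-bound?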

1 answer:

Answer 0: (score: 1)

The 'CPU_MILLISECONDS' counter tells you the total time that all tasks spent on the CPU.
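As a rough illustration of how that counter could be put next to a task's elapsed time, here is a minimal Python sketch. The function names, the 0.8 threshold, and the sample numbers are all assumptions made for this example, not something Hadoop defines. The elapsed time has to come from somewhere else, such as the job history. A low ratio only says the task was mostly off the CPU; it cannot tell I/O wait apart from CPU contention on a busy node.

```python
def cpu_fraction(cpu_ms: int, elapsed_ms: int) -> float:
    """Share of a task's wall-clock time that was spent on the CPU.

    cpu_ms     -- value of the CPU_MILLISECONDS counter for the task
    elapsed_ms -- wall-clock duration of the task (assumed to be known,
                  e.g. finish time minus start time from the job history)
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return cpu_ms / elapsed_ms


def rough_label(cpu_ms: int, elapsed_ms: int, cpu_bound_above: float = 0.8) -> str:
    """Label a task using an arbitrary, illustrative threshold (assumption)."""
    if cpu_fraction(cpu_ms, elapsed_ms) >= cpu_bound_above:
        return "likely CPU-bound"
    return "mostly waiting (I/O, network, or CPU contention)"


if __name__ == "__main__":
    # Hypothetical numbers: 12 s of CPU time in a task that ran for 60 s.
    print(rough_label(cpu_ms=12_000, elapsed_ms=60_000))
```

A multi-threaded task can accumulate more CPU time than wall-clock time, so the ratio may exceed 1.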

The larger the 'REDUCE_SHUFFLE_BYTES' value, the higher the network utilization. (Other counters can be read in the same way.)

There are four categories of counters in Hadoop: file system, job, framework, and custom.

You can use the built-in counters to verify that:

1. The correct number of bytes was read and written
2. The correct number of tasks was launched and successfully ran
3. The amount of CPU and memory consumed is appropriate for your job and cluster nodes
4. The correct number of records was read and written
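The checks in the list above could be sketched roughly as follows in Python. The `check_counters` helper and its dictionary input are hypothetical: it assumes the counter values have already been collected into a plain dict keyed by counter name, and that the job has a reduce phase. The records comparison relies on the assumption that, without a combiner, every map output record reaches a reducer.

```python
def check_counters(counters: dict, expected_input_records: int,
                   has_combiner: bool = False) -> list:
    """Return a list of human-readable problems found in the counter values.

    counters -- hypothetical dict of counter name -> value, collected by
                whatever means you use to read a finished job's counters
    """
    problems = []

    # Was the expected number of records read?
    read = counters.get("MAP_INPUT_RECORDS", 0)
    if read != expected_input_records:
        problems.append(
            f"read {read} input records, expected {expected_input_records}")

    # Did every launched task run successfully?
    failed = (counters.get("NUM_FAILED_MAPS", 0)
              + counters.get("NUM_FAILED_REDUCES", 0))
    if failed > 0:
        problems.append(f"{failed} map or reduce tasks failed")

    # Without a combiner, map output should match reduce input (assumption).
    if not has_combiner:
        out = counters.get("MAP_OUTPUT_RECORDS", 0)
        into = counters.get("REDUCE_INPUT_RECORDS", 0)
        if out != into:
            problems.append(
                f"map output records ({out}) != reduce input records ({into})")

    return problems


if __name__ == "__main__":
    # Hypothetical counter values for illustration only.
    sample = {
        "MAP_INPUT_RECORDS": 1000,
        "MAP_OUTPUT_RECORDS": 1000,
        "REDUCE_INPUT_RECORDS": 1000,
    }
    print(check_counters(sample, expected_input_records=1000))
```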

More information is available at https://www.mapr.com/blog/managing-monitoring-and-testing-mapreduce-jobs-how-work-counters#.VZy9IF_vPZ4 (credit: mapr.com)