Question

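I am new to Hadoop. When I run a job, I see its aggregate resource allocation reported as 251248654 MB-seconds, 24462 vcore-seconds. However, when I look up the cluster details, they show 888 VCores-Total and 15.90 TB Memory-Total. Can anyone tell me how these are related? What do MB-seconds and vcore-seconds refer to for the job?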

Is there any material online that explains these? I tried searching but did not get a proper answer.

Answer 1

VCores-Total: Indicates the total number of VCores available in the cluster
Memory-Total: Indicates the total memory available in the cluster.

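For example, I have a single-node cluster in which I have set the memory requirement per container to 1228 MB (determined by the config yarn.scheduler.minimum-allocation-mb) and the vCores per container to 1 vCore (determined by the config yarn.scheduler.minimum-allocation-vcores).

I have set yarn.nodemanager.resource.memory-mb to 9830 MB. So there can be a total of 8 containers per node (9830 / 1228 = 8, rounded down).

So, for my cluster: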

VCores-Total = 1 (node) * 8 (containers) * 1 (vCore per container) = 8 
Memory-Total = 1 (node) * 8 (containers) * 1228 MB (memory per container) = 9824 MB = 9.59375 GB = 9.6 GB
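
The arithmetic above can be reproduced with a short Python sketch. It only repeats the numbers from this example; the variable names are illustrative, and it does not query YARN in any way.

```python
# Values taken from the example above; variable names are illustrative only.
nodes = 1
node_memory_mb = 9830        # yarn.nodemanager.resource.memory-mb
container_memory_mb = 1228   # yarn.scheduler.minimum-allocation-mb
container_vcores = 1         # yarn.scheduler.minimum-allocation-vcores

# Whole containers that fit into one node's memory (integer division).
containers_per_node = node_memory_mb // container_memory_mb

vcores_total = nodes * containers_per_node * container_vcores
memory_total_mb = nodes * containers_per_node * container_memory_mb

print(containers_per_node)     # 8
print(vcores_total)            # 8
print(memory_total_mb)         # 9824
print(memory_total_mb / 1024)  # 9.59375 (GB)
```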

The figure below shows my cluster metrics:

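Now let's look at "MB-seconds" and "vcore-seconds". According to the description in the code (ApplicationResourceUsageReport.java):

MB-seconds: the aggregated amount of memory (in megabytes) the application has allocated, times the number of seconds the application has been running.

vcore-seconds: the aggregated number of vcores the application has allocated, times the number of seconds the application has been running.

The descriptions are self-explanatory (remember the keyword: aggregate).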
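
One way to read "aggregate" is as a sum over containers of the resources each one holds, multiplied by how long it holds them. The sketch below illustrates that reading with made-up container data; the tuples and names are hypothetical and are not taken from YARN or from any real job.

```python
# Hypothetical (memory_mb, vcores, seconds_held) records for three containers.
containers = [
    (1228, 1, 100),
    (1228, 1, 250),
    (2456, 2, 50),
]

# Each container contributes (resource held) x (seconds it was held).
mb_seconds = sum(mem * secs for mem, _, secs in containers)
vcore_seconds = sum(vcores * secs for _, vcores, secs in containers)

print(mb_seconds)     # 552600
print(vcore_seconds)  # 450
```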

Let me explain with an example. I ran a DistCp job (which spawned 25 containers) and got the following:

Aggregate Resource Allocation : 10361661 MB-seconds, 8424 vcore-seconds

Now, let's roughly calculate how much time each container took:

For memory:
10361661 MB-seconds = 10361661 / 25 (containers) / 1228 MB (memory per container) = 337.51 seconds = 5.62 minutes

For CPU:
8424 vcore-seconds = 8424 / 25 (containers) / 1 (vCore per container) = 336.96 seconds = 5.616 minutes

This indicates that, on average, each container took about 5.62 minutes to execute.
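
The same rough estimate can be checked in Python. This is a minimal sketch that assumes, as the calculation above does, that all 25 containers held 1228 MB and 1 vCore each; the variable names are illustrative.

```python
# Figures reported for the DistCp job in this answer.
mb_seconds = 10361661
vcore_seconds = 8424
num_containers = 25
container_memory_mb = 1228
container_vcores = 1

# Average seconds per container, derived independently from each metric.
secs_from_memory = mb_seconds / num_containers / container_memory_mb
secs_from_cpu = vcore_seconds / num_containers / container_vcores

print(f"{secs_from_memory:.1f} s = {secs_from_memory / 60:.1f} min")  # 337.5 s = 5.6 min
print(f"{secs_from_cpu:.1f} s = {secs_from_cpu / 60:.1f} min")        # 337.0 s = 5.6 min
```

The two estimates agree to within about a second, which is what you would expect if memory and vCores were held by the same containers for the same time.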

Hope this makes it clear. You can run a job and confirm it yourself.

Aggregate Resource Allocation for a job in YARN
