Question

I created an HDInsight cluster with 4 nodes. When I run the command "yarn node -list", it shows:

Node-Id          Node-State Node-Http-Address  Number-of-Running-Containers
10.x.x.x:xxxxx     RUNNING 10.x.x.x:xxxxx             0
10.x.x.x:xxxxx     RUNNING 10.x.x.x:xxxxx             0 
10.x.x.x:xxxxx     RUNNING 10.x.x.x:xxxxx             0
10.x.x.x:xxxxx     RUNNING 10.x.x.x:xxxxx             0

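After I submit a Hive job, it calculates the number of mappers (e.g. 900) and reducers (e.g. 100). If I then check the node details, the number of running containers is 8 on every node. If I submit a simple job, the number of running containers is 2 or 1, assigned seemingly at random to any of the worker nodes.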
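To make these numbers concrete, here is a rough sketch of the arithmetic. It assumes that each node runs at most 8 containers at a time (the value observed above) and that each map or reduce task occupies exactly one container. It ignores the ApplicationMaster container and any scheduler details, and the function name `waves` is only illustrative.

```python
import math


def waves(tasks: int, nodes: int, containers_per_node: int) -> int:
    """Rough number of rounds needed if every container runs one task at a time."""
    slots = nodes * containers_per_node  # containers available across the cluster
    return math.ceil(tasks / slots)


nodes = 4
containers_per_node = 8  # assumption: the per-node value seen in the node details

print(nodes * containers_per_node)             # 32 containers running at once
print(waves(900, nodes, containers_per_node))  # 29 rounds for 900 mappers
print(waves(100, nodes, containers_per_node))  # 4 rounds for 100 reducers
```

Under these assumptions, the 900 mappers cannot all run at once: at most 32 tasks run at any moment, which would match seeing 8 running containers on each of the 4 nodes.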

   1. We know that mapper/reducer tasks are assigned to worker nodes. Here, is it the 4 worker nodes that process the work, or the containers?
   2. Basically, what is the number of containers?
   3. How is the number of running containers assigned and changed?

Answer 1

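When you create a Hadoop cluster in Azure HDInsight, you specify a storage account and a blob container in that storage account. The blob container is used as the default storage location for the cluster. Optionally, you can specify additional Azure Storage accounts (linked storage) that the cluster can access. The cluster can also access any blob container that is configured with either full public read access or public read access for blobs only.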

https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters
