Question

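I wrote the following shell script to configure the YARN scheduler, but it does not work: when I pass the script as an input parameter, Dataproc cluster creation fails.

Do you know how to fix this?

Here is the script: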

#!/usr/bin/env bash

echo "<allocations>" >> /etc/hadoop/conf/fair-scheduler.xml
echo "  <userMaxAppsDefault>999</userMaxAppsDefault>" >> /etc/hadoop/conf/fair-scheduler.xml
echo "  <queueMaxAppsDefault>999</queueMaxAppsDefault>" >> /etc/hadoop/conf/fair-scheduler.xml
echo "</allocations>" >> /etc/hadoop/conf/fair-scheduler.xml

sed -i '$ d' /etc/hadoop/conf/yarn-site.xml

echo "  <property>" >> /etc/hadoop/conf/yarn-site.xml
echo "    <name>yarn.scheduler.fair.allocation.file</name>" >> /etc/hadoop/conf/yarn-site.xml
echo "    <value>/etc/hadoop/conf/fair-scheduler.xml</value>" >> /etc/hadoop/conf/yarn-site.xml
echo "  </property>" >> /etc/hadoop/conf/yarn-site.xml
echo "</configuration>" >> /etc/hadoop/conf/yarn-site.xml

systemctl restart hadoop-yarn-resourcemanager.service

Answer 1

You need to use a Dataproc initialization action to configure the YARN Fair Scheduler on Dataproc.

See this answer for an example of how to do that: https://stackoverflow.com/a/49693693/3227693
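
For illustration, here is a minimal sketch of what such an initialization action could look like. It is not taken from the linked answer, and it rests on a few assumptions: that initialization actions run on every node, so restarting the ResourceManager on a worker (where that service is not expected to run) may be what makes the script exit non-zero; that the node role can be read with `/usr/share/google/get_metadata_value attributes/dataproc-role`; and that the cluster is not already set to use the Fair Scheduler, which is why `yarn.resourcemanager.scheduler.class` is set as well. The helper names `write_allocations` and `add_property` are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch of an initialization action; role detection and paths are assumptions.
set -euo pipefail

CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# Write the allocation file from scratch (">" rather than ">>"), so a rerun
# or a pre-existing file cannot end up with two <allocations> roots.
write_allocations() {
  local file="$1"
  cat > "${file}" <<'EOF'
<?xml version="1.0"?>
<allocations>
  <userMaxAppsDefault>999</userMaxAppsDefault>
  <queueMaxAppsDefault>999</queueMaxAppsDefault>
</allocations>
EOF
}

# Insert a <property> right before the closing </configuration> tag instead of
# deleting the last line of the file. Requires GNU sed; values must not
# contain "|" or "&".
add_property() {
  local file="$1" name="$2" value="$3"
  local block="  <property>\n    <name>${name}</name>\n    <value>${value}</value>\n  </property>"
  sed -i "s|</configuration>|${block}\n</configuration>|" "${file}"
}

main() {
  local role
  role="$(/usr/share/google/get_metadata_value attributes/dataproc-role)"

  write_allocations "${CONF_DIR}/fair-scheduler.xml"
  add_property "${CONF_DIR}/yarn-site.xml" \
    yarn.resourcemanager.scheduler.class \
    org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
  add_property "${CONF_DIR}/yarn-site.xml" \
    yarn.scheduler.fair.allocation.file "${CONF_DIR}/fair-scheduler.xml"

  # Restart the ResourceManager only on the master node.
  if [[ "${role}" == "Master" ]]; then
    systemctl restart hadoop-yarn-resourcemanager.service
  fi
}

# Run only where the Dataproc metadata helper exists, so the functions can be
# sourced and tried out on any other machine without touching /etc/hadoop.
if [[ -x /usr/share/google/get_metadata_value ]]; then
  main
fi
```

The script would then be uploaded to a Cloud Storage bucket and passed to `gcloud dataproc clusters create` through the `--initialization-actions` flag. If cluster creation still fails, the initialization action's output log on the failing node should show which command returned a non-zero exit code.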
