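I have been trying to get spark-submit to work against my Cloudera cluster for a few weeks. I really hope someone knows how this works.
I created a script that calls spark-submit with all the required arguments. It prints the following lines to the screen: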
Using properties file: null
Using properties file: null
Parsed arguments:
master yarn
deployMode cluster
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile null
driverMemory null
driverCores null
driverExtraClassPath /home/bruce/workspace1/spark-cloudera/yarn/stable/target/spark-yarn_2.10-1.0.0-cdh5.1.0.jar:/home/bruce/.m2/repository/org/apache/hadoop/hadoop-yarn-client/2.3.0-cdh5.1.0/hadoop-yarn-client-2.3.0-cdh5.1.0.jar:/home/bruce/.m2/repository/org/apache/hadoop/hadoop-common/2.3.0-cdh5.1.0/hadoop-common-2.3.0-cdh5.1.0.jar:/home/bruce/.m2/repository/org/apache/hadoop/hadoop-yarn-api/2.3.0-cdh5.1.0/hadoop-yarn-api-2.3.0-cdh5.1.0.jar:/home/bruce/.m2/repository/org/apache/hadoop/hadoop-yarn-common/2.3.0-cdh5.1.0/hadoop-yarn-common-2.3.0-cdh5.1.0.jar:/home/bruce/.m2/repository/org/apache/hadoop/hadoop-auth/2.3.0-cdh5.1.0/hadoop-auth-2.3.0-cdh5.1.0.jar:/home/bruce/.m2/repository/com/google/protobuf/protobuf-java/2.5.0/protobuf-java-2.5.0.jar
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files null
pyFiles null
archives null
mainClass org.apache.spark.examples.SparkPi
primaryResource file:/home/bruce/workspace1/spark-cloudera/examples/target/scala-2.10/spark-examples-1.0.0-cdh5.1.0-hadoop2.3.0-cdh5.1.0.jar
name org.apache.spark.examples.SparkPi
childArgs [10]
jars null
verbose true
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
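The call stalls for a long time and then quits with "connection refused".
What I don't understand is that the arguments specify using the YarnClient, but nowhere is there any indication that it knows how to contact the YARN ResourceManager: no IP, no port. The submit runs on my laptop, and the cluster is on a neighboring subnet. How does spark-submit determine how to contact the YARN services?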
Answer 0 (score: 2)
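Ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory that contains the (client-side) configuration files for the Hadoop cluster. These configs are used to write to the DFS and to connect to the YARN ResourceManager.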
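
As a quick sanity check before running spark-submit, something like the following sketch can show which ResourceManager address the client configuration would yield. It assumes the directory contains a standard `yarn-site.xml` that sets `yarn.resourcemanager.address` or `yarn.resourcemanager.hostname`. The helper name `find_resourcemanager_address` is made up for this illustration, and the sketch does not handle ResourceManager HA setups or values that reference other properties.

```python
import os
import xml.etree.ElementTree as ET


def find_resourcemanager_address(conf_dir):
    """Return the ResourceManager address found in conf_dir/yarn-site.xml, or None.

    Hypothetical helper for illustration; not part of Spark or Hadoop.
    """
    site = os.path.join(conf_dir, "yarn-site.xml")
    if not os.path.isfile(site):
        return None

    # Collect every <property><name>/<value> pair from the file.
    props = {}
    for prop in ET.parse(site).getroot().findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()

    # An explicit address wins over the hostname setting.
    address = props.get("yarn.resourcemanager.address")
    if address:
        return address

    # With only the hostname set, Hadoop's default client port is 8032.
    host = props.get("yarn.resourcemanager.hostname")
    return f"{host}:8032" if host else None


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR") or os.environ.get("YARN_CONF_DIR")
    if not conf_dir:
        print("Neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set")
    else:
        print("ResourceManager address:", find_resourcemanager_address(conf_dir))
```

If this prints no address, or an address that is not reachable from the laptop's subnet, that would be consistent with the "connection refused" failure described in the question.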