I am trying to submit a Spark job using spark-submit, as shown below:
SPARK_MAJOR_VERSION=2 spark-submit --conf spark.ui.port=4090 --driver-class-path /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar --jars /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar --executor-cores 3 --executor-memory 13G --class com.partition.source.YearPartition splinter_2.11-0.1.jar --master=yarn --keytab /home/devusr/devusr.keytab --principal devusr@DEV.COM --files /usr/hdp/current/spark2-client/conf/hive-site.xml,testconnection.properties --name Splinter --conf spark.executor.extraClassPath=/home/devusr/jars/greenplum-spark_2.11-1.3.0.jar --conf spark.executor.instances=10 --conf spark.dynamicAllocation.enabled=false --conf spark.files.maxPartitionBytes=256M
But the job does not run; it only prints:
SPARK_MAJOR_VERSION is set to 2, using Spark2
Could someone let me know whether the parameters passed to spark-submit must follow a specific order?
Answer 0 (score: 1)
The format for using spark-submit in cluster mode on yarn is documented at https://spark.apache.org/docs/2.1.0/running-on-yarn.html:

$ ./bin/spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] <app jar> [app options]

If splinter_2.11-0.1.jar is the jar that contains the class com.partition.source.YearPartition, could you try using this:
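Note that spark-submit treats everything after the application jar as arguments to your main class, so options such as --master=yarn and --keytab that were placed after splinter_2.11-0.1.jar never reached spark-submit itself. A reordered sketch of the command from the question, with all options moved before the application jar (paths and settings are taken from the question; --deploy-mode cluster follows the format quoted above and can be omitted or changed to client as needed):

# all spark-submit options come first; the application jar must come last
spark-submit --class com.partition.source.YearPartition \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.ui.port=4090 \
  --driver-class-path /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar \
  --jars /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar \
  --executor-cores 3 \
  --executor-memory 13G \
  --keytab /home/devusr/devusr.keytab \
  --principal devusr@DEV.COM \
  --files /usr/hdp/current/spark2-client/conf/hive-site.xml,testconnection.properties \
  --name Splinter \
  --conf spark.executor.extraClassPath=/home/devusr/jars/greenplum-spark_2.11-1.3.0.jar \
  --conf spark.executor.instances=10 \
  --conf spark.dynamicAllocation.enabled=false \
  --conf spark.files.maxPartitionBytes=256M \
  splinter_2.11-0.1.jar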