Executing a Spark job in cluster mode

Asked: 2019-06-18 15:34:06

Tags: apache-spark pyspark yarn

I am trying to execute a pyspark application in cluster mode with the following command:

spark-submit --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./app/myapplication/bin/python --master yarn --deploy-mode cluster --queue dev --archives /opt/myapplication.zip#app /bin/first_pipeline.py

--archives - ships my whole conda environment
spark.yarn.appMasterEnv.PYSPARK_PYTHON - sets the Python interpreter
--queue - which YARN queue to use
first_pipeline.py - the file I want to execute (this file is present inside the bin folder: myapplication/bin/first_pipeline.py)
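For reference, a cluster-mode submission that ships a conda environment via --archives is usually structured like the sketch below. Note this is an assumption about the intended layout, not taken verbatim from the question: in particular, the application path app/bin/first_pipeline.py (resolved inside the unpacked archive aliased as "app") is hypothetical, since the path in the original command starts with a bare /bin.

```shell
# Sketch of a cluster-mode spark-submit shipping a conda env via --archives.
# ASSUMPTION: the script path app/bin/first_pipeline.py reflects one plausible
# layout of the shipped archive; adjust it to where the file actually lives.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --queue dev \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./app/myapplication/bin/python \
  --archives /opt/myapplication.zip#app \
  app/bin/first_pipeline.py
```

The archive suffix #app is the alias under which YARN unpacks the zip in each container's working directory, which is why the PYSPARK_PYTHON path is relative to ./app.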

I am getting an error message:

Error: Cannot load main class from JAR file:/dev

(I am executing this command from my home directory.)

What am I missing in the command?

0 Answers:

There are no answers yet.