I am trying to invoke spark-submit using SparkSubmitOperator. Before spark-submit runs, I have to set SPARK_MAJOR_VERSION and HADOOP_USER_NAME. Can anyone help me?
I am trying to run in YARN mode and I have passed env_vars, but SPARK_MAJOR_VERSION is still not set.
INFO - [2019-03-11 21:07:03,525] {base_hook.py:83} INFO - Using connection to: id: spark_default. Host: yarn://XXXX, Port: 8088, Schema: None, Login: peddnade, Password: XXXXXXXX, extra: {u'queue': u'priority', u'namespace': u'default', u'spark-home': u'/usr/'}
[2019-03-11 21:07:03,526] {logging_mixin.py:95} INFO - [2019-03-11 21:07:03,526] {spark_submit_hook.py:283} INFO - Spark-Submit cmd: [u'/usr/bin/spark-submit', '--master', 'yarn:/XX:8088', '--conf', 'spark.dynamicAllocation.enabled=true', '--conf', 'spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1', '--conf', 'spark.app.name=RDM', '--conf', 'spark.yarn.queue=priority', '--conf', 'spark.shuffle.service.enabled=true', '--conf', 'spark.yarn.appMasterEnv.SPARK_MAJOR_VERSION=2', '--conf', 'spark.yarn.appMasterEnv.HADOOP_USER_NAME=ppeddnade', '--files', '/usr/hdp/current/spark-client/conf/hive-site.xml', '--jars', '/usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar', '--num-executors', '4', '--total-executor-cores', '4', '--executor-cores', '4', '--executor-memory', '5g', '--driver-memory', '10g', '--name', u'airflow-spark-example', '--class', 'com.hilton.eim.job.SubmitSparkJob', '--queue', u'priority', '/home/ppeddnade/XX.jar', u'XX']
[2019-03-11 21:07:03,542] {logging_mixin.py:95} INFO - [2019-03-11 21:07:03,542] {spark_submit_hook.py:415} INFO - Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
[2019-03-11 21:07:03,542] {logging_mixin.py:95} INFO - [2019-03-11 21:07:03,542] {spark_submit_hook.py:415} INFO - Spark1 will be picked by default
Answer 0 (score: 0)
SparkSubmitOperator provides the env_vars parameter for setting your environment variables (it is also available in SparkSubmitHook):
:param env_vars: Environment variables for spark-submit. It supports yarn and k8s mode too. (templated)
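Judging by the command in the question's log, in YARN mode each env_vars entry ends up as a `--conf spark.yarn.appMasterEnv.<NAME>=<VALUE>` flag, a setting that Spark documents as applying to the YARN Application Master process. A minimal sketch of that mapping follows; the helper name `yarn_env_conf_flags` is hypothetical and this is an illustration, not Airflow's own code.

```python
def yarn_env_conf_flags(env_vars):
    """Turn an env_vars dict into spark-submit --conf flags for YARN mode.

    Hypothetical helper: it only mirrors the flags visible in the question's log.
    """
    flags = []
    for name, value in env_vars.items():
        # One "--conf key=value" pair per environment variable.
        flags += ["--conf", "spark.yarn.appMasterEnv.{}={}".format(name, value)]
    return flags


flags = yarn_env_conf_flags(
    {"SPARK_MAJOR_VERSION": "2", "HADOOP_USER_NAME": "ppeddnade"}
)
print(flags)
```

If this reading is right, the variables reach the application master on the cluster, which may be why the local spark-submit wrapper still reports SPARK_MAJOR_VERSION as not set.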
You can infer its usage from test_spark_submit_hook.py:
    hook = SparkSubmitHook(conn_id='spark_standalone_cluster_client_mode',
                           env_vars={"bar": "foo"})
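Applied to the question, the same parameter can be passed to the operator. The sketch below assumes an Airflow 1.10-style import path (newer releases ship the operator in the Apache Spark provider package instead); the task id and the factory function `make_spark_task` are hypothetical, while the other values are copied from the question's log.

```python
# Environment variables the question wants to set for spark-submit.
ENV_VARS = {"SPARK_MAJOR_VERSION": "2", "HADOOP_USER_NAME": "ppeddnade"}


def make_spark_task(dag):
    """Build a SparkSubmitOperator task that passes ENV_VARS via env_vars.

    Airflow is imported lazily so this module loads even where Airflow
    is not installed. The import path is an assumption (Airflow 1.10.x).
    """
    from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

    return SparkSubmitOperator(
        task_id="submit_spark_job",  # hypothetical task id
        conn_id="spark_default",
        application="/home/ppeddnade/XX.jar",
        java_class="com.hilton.eim.job.SubmitSparkJob",
        name="airflow-spark-example",
        env_vars=ENV_VARS,
        dag=dag,
    )
```

Inside a DAG file you would call `make_spark_task(dag)` once and wire the returned task into the rest of the DAG as usual.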
Although you did not ask about it, you may also want to run spark-submit on a remote cluster; for that, you can look at the available options.