Say I have a PythonOperator task that pushes a message to XCom. How can I pull that message in a SparkSubmitOperator?
def get_some_value(**kwargs):
    some_value = 10
    return some_value

task1 = PythonOperator(
    task_id='run_task_1',
    python_callable=get_some_value,
    provide_context=True,
    dag=dag,
)

task2 = SparkSubmitOperator(
    task_id='run_sparkSubmit_job',
    conn_id='spark_default',
    java_class='com.example',
    application='example.jar',
    name='airflow-spark-job',
    verbose=True,
    application_args=["some_value"],  # <--- I want to use some_value from task1 here
    conf={'master': 'yarn'},
    dag=dag,
)

task1 >> task2
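Answer 0 (score: 3)
Call xcom_pull on the TaskInstance (ti) object that Airflow exposes to templates to load the value returned by task1. Pass the task ID 'run_task_1' to select which task's value to retrieve: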
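# Import paths for Airflow 1.10; `dag` is assumed to be defined elsewhere.
from airflow.operators.python_operator import PythonOperator
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

def get_some_value(**kwargs):
    some_value = 10
    # The return value is pushed to XCom automatically.
    return some_value

task1 = PythonOperator(
    task_id='run_task_1',
    python_callable=get_some_value,
    provide_context=True,
    dag=dag,
)

task2 = SparkSubmitOperator(
    task_id='run_sparkSubmit_job',
    conn_id='spark_default',
    java_class='com.example',
    application='example.jar',
    name='airflow-spark-job',
    verbose=True,
    # Rendered at run time to the value returned by run_task_1.
    application_args=["{{ ti.xcom_pull(task_ids='run_task_1') }}"],
    conf={'master': 'yarn'},
    dag=dag,
)

# task1 must run first so its XCom value exists when task2 is rendered.
task1 >> task2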
application_args accepts a Jinja template like this because it is a templated field. See: https://github.com/apache/incubator-airflow/blob/v1-10-stable/airflow/contrib/operators/spark_submit_operator.py#L87
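One thing to keep in mind is that a rendered template is text, so the Spark application receives the string "10" rather than the integer 10. The question submits a jar, but if the application were a Python script instead (an assumption made only for illustration), the receiving side might look roughly like this. The helper name parse_some_value is hypothetical and not part of Airflow or Spark:

```python
def parse_some_value(argv):
    """Return the first application argument as an int.

    Hypothetical helper: argv is expected to look like sys.argv, i.e.
    [script_name, some_value, ...]. The templated argument arrives as
    text, so it is converted explicitly.
    """
    if len(argv) < 2:
        raise ValueError("expected some_value as the first application argument")
    return int(argv[1])
```

In such a script you would call parse_some_value(sys.argv) near the top. A JVM application would do the equivalent conversion on the arguments passed to its main method.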