How to pass an XCom message from a PythonOperator task to a SparkSubmitOperator task in Airflow

Time: 2018-08-20 20:28:31

Tags: python airflow

Say I have a PythonOperator task that pushes a message to XCom. How can I pull this message in a SparkSubmitOperator task?

def get_some_value(**kwargs):
    some_value = 10
    return some_value

task1 = PythonOperator(task_id='run_task_1',
                       python_callable=get_some_value,
                       provide_context=True,
                       dag=dag)

task2 = SparkSubmitOperator(
    task_id='run_sparkSubmit_job',
    conn_id='spark_default',
    java_class='com.example',
    application='example.jar',
    name='airflow-spark-job',
    verbose=True,
    application_args=["some_value"],   #<---I want to use some_value from task1 here
    conf={'master':'yarn'},
    dag=dag,
)

task1 >> task2

1 answer:

Answer 0: (score: 3)

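Call xcom_pull on the TaskInstance (ti) object, which is available as a macro inside templates, to load the value returned by task1. Pass the task ID 'run_task_1' to select which task's value to retrieve: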

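# Import paths for Airflow 1.10; `dag` is assumed to be defined elsewhere.
from airflow.operators.python_operator import PythonOperator
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

def get_some_value(**kwargs):
    some_value = 10
    # The return value of a python_callable is pushed to XCom automatically.
    return some_value

task1 = PythonOperator(task_id='run_task_1',
                       python_callable=get_some_value,
                       provide_context=True,
                       dag=dag)

task2 = SparkSubmitOperator(
    task_id='run_sparkSubmit_job',
    conn_id='spark_default',
    java_class='com.example',
    application='example.jar',
    name='airflow-spark-job',
    verbose=True,
    # Rendered at run time; pulls the value returned by run_task_1.
    application_args=["{{ti.xcom_pull(task_ids='run_task_1')}}"],
    conf={'master':'yarn'},
    dag=dag,
)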

application_args supports Jinja templating because it is a templated field. See: https://github.com/apache/incubator-airflow/blob/v1-10-stable/airflow/contrib/operators/spark_submit_operator.py#L87
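To see what the template evaluates to, here is a minimal sketch that renders the same string with the jinja2 library directly, outside Airflow. StubTaskInstance and render_args are hypothetical names made up for this illustration; they only imitate the call shape of ti.xcom_pull and are not Airflow's own classes or rendering code.

```python
from jinja2 import Template


class StubTaskInstance:
    """Hypothetical stand-in for Airflow's TaskInstance, for illustration only."""

    def __init__(self, xcom_values):
        # Maps task_id -> the value that task returned (and pushed to XCom).
        self._xcom_values = xcom_values

    def xcom_pull(self, task_ids):
        # Same call shape as in the template: ti.xcom_pull(task_ids='...')
        return self._xcom_values[task_ids]


def render_args(application_args, ti):
    # Render every argument as a Jinja template with `ti` in the context.
    return [Template(arg).render(ti=ti) for arg in application_args]


ti = StubTaskInstance({"run_task_1": 10})
rendered = render_args(["{{ti.xcom_pull(task_ids='run_task_1')}}"], ti)
print(rendered)  # ['10']
```

Note that the rendered argument is the string '10', not the integer 10, so the Spark application has to parse it if it needs a number.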