Passing data between components in Airflow

Time: 2019-04-05 09:01:58

Tags: python publish-subscribe airflow apache-airflow-xcom

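I am no stranger to Airflow and have read most of the documentation. From the documentation, I understand that small pieces of data can be shared between the components of a DAG using the XCom class. The component that publishes the data must push it, and the component that subscribes to the data must pull it.

However, I am not clear on the syntax for pushing and pulling. I referred to the XCom section of the documentation and developed a code template. Suppose I have the code below, which has only two components, a pusher and a puller. The pusher publishes the current time, which must be consumed by the puller and written to a log file.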

from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

log_file_location = '/usr/local/airflow/logs/time_log.log'

default_args = {'owner':'apache'}
dag = DAG('pushpull', default_args = default_args)

def push_function():
    #push this data on the DAG as key-value pair
    return(datetime.now()) #current time

def pull_function():
    with open(log_file_location, 'a') as logfile:
        current_time = '' #pull data from the pusher as key - value pair
        logfile.writelines('current time = '+current_time)
    logfile.close()

with dag:
    t1 = PythonOperator(
        task_id = 'pusher', 
        python_callable = push_function)

    t2 = PythonOperator(
        task_id = 'puller', 
        python_callable = pull_function)

    t2.set_upstream(t1)

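Here I need help from the Airflow experts with two pieces of syntax:

  1. How to push the data, together with a key, from the push function
  2. How to make the pull function retrieve the data using that key

Thanks!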

1 answer:

Answer 0 (score: 1)

Example of pushing to XCom with a key:

def push_function(**context):
    msg='the_message'
    print("message to push: '%s'" % msg)
    task_instance = context['task_instance']
    task_instance.xcom_push(key="the_message", value=msg)

Example of pulling from XCom with a key:

def pull_function(**kwargs):
    ti = kwargs['ti']
    msg = ti.xcom_pull(task_ids='push_task',key='the_message')
    print("received message: '%s'" % msg)
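
For the push function in the question, which simply returns a value, there is also a shortcut: a PythonOperator stores its callable's return value in XCom under the default key return_value, and xcom_pull reads that key when no key is given. The following is only a minimal sketch under a few assumptions: the task id 'pusher' is taken from the question, the time is returned as an ISO string so that it stays serializable whatever the XCom settings are, and StubTaskInstance is a hypothetical stand-in (not an Airflow class) that lets the callables be tried outside a scheduler.

```python
from datetime import datetime


def push_function():
    # A PythonOperator stores its callable's return value in XCom
    # under the default key "return_value".
    # Assumption: an ISO string is returned instead of a datetime object.
    return datetime.now().isoformat()


def pull_function(**context):
    ti = context["ti"]
    # Without a key argument, xcom_pull reads the "return_value" key.
    current_time = ti.xcom_pull(task_ids="pusher")
    print("current time = %s" % current_time)
    return current_time


class StubTaskInstance:
    """Hypothetical stand-in for Airflow's TaskInstance, for dry runs only."""

    def __init__(self):
        self.store = {}

    def xcom_push(self, key, value):
        self.store[key] = value

    def xcom_pull(self, task_ids=None, key="return_value"):
        # The stub ignores task_ids and looks the value up by key only.
        return self.store.get(key)


# Dry run outside Airflow: mimic what the operator does with the return value.
stub = StubTaskInstance()
stub.xcom_push(key="return_value", value=push_function())
pulled = pull_function(ti=stub)
```

On the Airflow version used in this answer, the puller task still needs provide_context=True so that 'ti' is passed in; the pusher does not, because its callable takes no arguments.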

Example DAG:

from datetime import datetime, timedelta
from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator

# Named "dag" so that the imported DAG class is not shadowed.
dag = DAG(
    dag_id='simple_xcom',
    start_date=datetime(2017, 10, 26),
    schedule_interval=timedelta(1)
)

def push_function(**context):
    msg = 'the_message'
    print("message to push: '%s'" % msg)
    task_instance = context['task_instance']
    # Publish the value under an explicit key.
    task_instance.xcom_push(key="the_message", value=msg)

push_task = PythonOperator(
    task_id='push_task',
    python_callable=push_function,
    provide_context=True,  # passes the task context (including 'ti') as kwargs
    dag=dag)

def pull_function(**kwargs):
    ti = kwargs['ti']  # same object as kwargs['task_instance']
    # task_ids must match the task_id of the pushing task.
    msg = ti.xcom_pull(task_ids='push_task', key='the_message')
    print("received message: '%s'" % msg)

pull_task = PythonOperator(
    task_id='pull_task',
    python_callable=pull_function,
    provide_context=True,
    dag=dag)

# push_task runs first, then pull_task.
push_task >> pull_task
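
Applied to the question's own case (publish the current time, then append it to a log file), the two callables might look like the sketch below. Several things here are assumptions rather than anything prescribed by Airflow: the key name held in XCOM_KEY is made up for illustration, the time is pushed as an ISO string, the log path is passed in as an argument instead of being a module-level constant, and the dry run uses MagicMock from the standard library in place of a real task instance so that the functions can be exercised without a running scheduler.

```python
import os
import tempfile
from datetime import datetime
from unittest.mock import MagicMock

XCOM_KEY = "current_time"  # hypothetical key name chosen for this sketch


def push_function(**context):
    # Publish the current time as an ISO string under an explicit key.
    context["ti"].xcom_push(key=XCOM_KEY, value=datetime.now().isoformat())


def pull_function(log_file_location, **context):
    # Read the value published by the task whose task_id is "pusher".
    current_time = context["ti"].xcom_pull(task_ids="pusher", key=XCOM_KEY)
    with open(log_file_location, "a") as logfile:
        logfile.write("current time = %s\n" % current_time)
    return current_time


# Dry run outside Airflow; MagicMock stands in for the real TaskInstance.
ti = MagicMock()
push_function(ti=ti)
pushed_value = ti.xcom_push.call_args[1]["value"]

# Make the mock hand back what was pushed, then log to a temporary file.
ti.xcom_pull.return_value = pushed_value
log_path = os.path.join(tempfile.mkdtemp(), "time_log.log")
pull_function(log_path, ti=ti)

with open(log_path) as logfile:
    logged = logfile.read()
```

In the DAG itself, the tasks would keep the ids 'pusher' and 'puller' from the question, both with provide_context=True, and the puller could receive the path through op_kwargs={'log_file_location': '/usr/local/airflow/logs/time_log.log'}.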