How to extend PythonOperator

Time: 2019-05-15 12:02:58

Tags: python airflow

I am trying to customize my own PythonOperator and put it under $AIRFLOW_HOME/plugins, as follows:



    from airflow.operators.python_operator import PythonOperator
    from airflow.utils.decorators import apply_defaults


    class MyPythonOperator(PythonOperator):

        def my_callable(param1, param2, param3):
            # do something
            pass

        @apply_defaults
        def __init__(self, task_id, *args, **kwargs):
            super(MyPythonOperator, self).__init__(
                task_id=task_id,
                python_callable=self.my_callable,
                provide_context=True,
                *args, **kwargs)

Then I define an Airflow DAG. It is very simple and just runs two tasks:



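    from datetime import timedelta

    import airflow
    from airflow import DAG

    # MyPythonOperator is the custom operator defined above.

    args = {
        'owner': 'airflow',
        'start_date': airflow.utils.dates.days_ago(2),
    }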

    dag = DAG(
        dag_id='example_workflow',
        default_args=args,
        schedule_interval='0 0 * * *',
        dagrun_timeout=timedelta(minutes=60),
    )


    task1 = MyPythonOperator(
        task_id='task1',
        params={'param1': 'param1_value',
                'param2': 'param2_value',
                'param3': 'param3_value'},
        dag=dag
    )

    task2 = MyPythonOperator(
        task_id='task2',
        params={'param1': 'param1_value',
                'param2': 'param2_value',
                'param3': 'param3_value'},
        dag=dag
    )

    task1 >> task2

But when I run the DAG's Python file, I get this error message:


    $ python example_airflow_code.py
    [2019-05-15 19:51:10,338] {__init__.py:51} INFO - Using executor SequentialExecutor
    usage: example_airflow_code.py [-h]
                                   {list_tasks,backfill,test,run,pause,unpause,list_dag_runs}
                                   ...
    example_airflow_code.py: error: too few arguments

I tried some debugging and set a breakpoint on this line:


    super(MyPythonOperator, self).__init__(

Just before the super constructor is called, I found that self.dag and self.dag_id hold unexpected values. The debugger shows:


    str: Traceback (most recent call last):
      File "/Applications/Eclipse.app/Contents/Eclipse/plugins/org.python.pydev.core_6.4.4.201807281807/pysrc/_pydevd_bundle/pydevd_resolver.py", line 166, in _getPyDictionary
        attr = getattr(var, n)
      File "/Users/zhuangxy/anaconda2/lib/python2.7/site-packages/airflow/models/__init__.py", line 2399, in dag_id
        return 'adhoc_' + self.owner
    AttributeError: 'MyPythonOperator' object has no attribute 'owner'

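Does anyone know what is wrong with this example? Thank you very much!

1 Answer:

Answer 0: (score: 0)

I ran into this recently too. It looks like the callable in your custom PythonOperator is missing the context parameter.

Change the method definition so it looks like this: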

    def my_callable(param1, param2, param3, **context):
        # do something
        pass

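The failure comes from the provide_context=True flag you set in the operator. With that flag, the operator passes the task context to the Python callable as extra keyword arguments, so the callable has to accept them (for example through **context).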
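
To make that concrete, here is a minimal sketch in plain Python that does not need Airflow installed. The `run_callable` helper and the `fake_context` dict are made up for illustration. They only imitate the general idea of merging the context into the keyword arguments and are not Airflow's actual implementation.

```python
# Simplified stand-in for how an operator might invoke its callable when
# provide_context=True. Hypothetical helper, NOT Airflow's real code.
def run_callable(python_callable, op_kwargs, context, provide_context=True):
    kwargs = dict(op_kwargs)
    if provide_context:
        # The task context is merged into the keyword arguments.
        kwargs.update(context)
    return python_callable(**kwargs)


def callable_without_context(param1, param2, param3):
    # No **kwargs, so any extra keyword argument raises TypeError.
    return (param1, param2, param3)


def callable_with_context(param1, param2, param3, **context):
    # Extra keys such as "ds" are collected in **context.
    return (param1, param2, param3, sorted(context))


op_kwargs = {"param1": "a", "param2": "b", "param3": "c"}
# Hypothetical context keys, chosen only for this example.
fake_context = {"ds": "2019-05-15", "params": {}}

try:
    run_callable(callable_without_context, op_kwargs, fake_context)
    failed = False
except TypeError:
    # The callable could not absorb the extra context keywords.
    failed = True

result = run_callable(callable_with_context, op_kwargs, fake_context)
```

In this sketch the first call fails and the second one succeeds, which is the same shape of fix as adding **context above. One more thing worth checking, as far as I know: values passed through `params=` show up under `context['params']` rather than as separate arguments, while `op_kwargs` is the PythonOperator argument for passing keyword arguments straight to the callable.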