Airflow DAG deadlock with the Celery executor

Time: 2017-06-06 23:03:56

Tags: python celery airflow

When using the CeleryExecutor, I run into a deadlock:

https://www.dropbox.com/s/mfxqhawwf0760gm/Screenshot%202017-05-06%2019.14.06.png?dl=0

The last log message when using the CeleryExecutor says the DAG is deadlocked; the message comes from here: https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L4250-L4253

The same DAG does propagate the failure correctly with the SequentialExecutor.

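Below is the sample DAG I use to reproduce the problem. I force a failure in the `dataops_weekly_update_reviews` task by looking up a keyword argument that does not exist.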
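Before the full DAG, here is the forced failure in isolation, as a minimal sketch in plain Python with no Airflow dependency. It shows that looking up a key that is missing from `kwargs` raises `KeyError`, which is what makes the task fail. The `fake_context` dict is a hypothetical stand-in for the context Airflow passes when `provide_context=True`; only `ds` is modelled.

```python
def dataops_weekly_update_reviews(*args, **kwargs):
    # "dsasdfdsfa" is deliberately absent from the context, so this lookup raises KeyError.
    return "UPDATING weekly reviews {}".format(kwargs["dsasdfdsfa"])


# Hypothetical stand-in for the Airflow task context (assumption: only "ds" is present).
fake_context = {"ds": "2017-05-05"}

try:
    dataops_weekly_update_reviews(**fake_context)
    failed = False
    missing_key = None
except KeyError as err:
    # The exception carries the name of the key that was not found.
    failed = True
    missing_key = err.args[0]
```

In the real DAG the exception is raised inside the task process, so the task instance should end up in the failed state rather than the DAG file failing to load.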

Has anyone run into this problem before?

```
import airflow
import datetime
from airflow.operators.python_operator import PythonOperator
from airflow.models import DAG

args = {
    'owner': 'airflow',
    'start_date': datetime.datetime(2017, 5, 5),
    'queue': 'development'
}

dag = DAG(
    dag_id='example_dataops_weekly_reviews', default_args=args,
    schedule_interval=None)


def instantiate_emr_cluster(*args, **kwargs):
    return "instantiating emr cluster"

task_instantiate_emr_cluster = PythonOperator(
    task_id="instantiate_emr_cluster",
    python_callable=instantiate_emr_cluster,
    provide_context=True,
    dag=dag)


def initialize_tables(*args, **kwargs):
    return "initializing tables {}".format(kwargs["ds"])


task_initialize_tables = PythonOperator(
    task_id="initialize_tables",
    python_callable=initialize_tables,
    provide_context=True,
    dag=dag)


def dataops_weekly_update_reviews(*args, **kwargs):
    return "UPDATING weekly reviews {}".format(kwargs["dsasdfdsfa"])


task_dataops_weekly_update_reviews = PythonOperator(
    task_id="dataops_weekly_update_reviews",
    python_callable=dataops_weekly_update_reviews,
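    # NOTE: on_failure_callback expects a callable that accepts the context dict;
    # this expression instead calls handle_failure while the DAG file is parsed.
    on_failure_callback=airflow.models.TaskInstance.handle_failure("dataops branching error"),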
    provide_context=True,
    dag=dag)


def load_dataops_reviews(*args, **kwargs):
    return "loading dataops reviews"


task_load_dataops_reviews = PythonOperator(
    task_id="load_dataops_reviews",
    python_callable=load_dataops_reviews,
    provide_context=True,
    dag=dag)


def load_dataops_surveys(**kwargs):
    return "Print out the running EMR cluster"


task_load_dataops_surveys = PythonOperator(
    task_id="load_dataops_surveys",
    provide_context=True,
    python_callable=load_dataops_surveys,
    dag=dag)


def load_cs_survey_answers(**kwargs):
    return "load cs survey answers"


task_load_cs_survey_answers = PythonOperator(
    task_id="load_cs_survey_answers",
    provide_context=True,
    python_callable=load_cs_survey_answers,
    dag=dag)


def terminate_emr_cluster(*args, **kwargs):
    return "terminate emr cluster"


task_terminate_emr_cluster = PythonOperator(
    task_id="terminate_emr_cluster",
    python_callable=terminate_emr_cluster,
    provide_context=True,
    trigger_rule="all_done",
    dag=dag)


task_initialize_tables.set_upstream(task_instantiate_emr_cluster)
task_dataops_weekly_update_reviews.set_upstream(task_initialize_tables)
task_load_dataops_reviews.set_upstream(task_dataops_weekly_update_reviews)
task_terminate_emr_cluster.set_upstream(task_load_dataops_reviews)
task_load_dataops_surveys.set_upstream(task_dataops_weekly_update_reviews)
task_terminate_emr_cluster.set_upstream(task_load_dataops_surveys)
task_load_cs_survey_answers.set_upstream(task_dataops_weekly_update_reviews)
task_terminate_emr_cluster.set_upstream(task_load_cs_survey_answers)

```
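For reference, the end state I would expect from the DAG above can be sketched with a simplified model of the trigger rules. This is not Airflow's scheduler code; it only assumes the documented semantics that a task with the default `all_success` rule becomes `upstream_failed` when any upstream task did not succeed, and that a task with `all_done` runs once all of its upstream tasks have finished, whatever their state. The `DAG_TASKS` table and the `simulate` function are hypothetical names introduced for illustration.

```python
# Simplified model of trigger-rule evaluation; NOT Airflow's actual scheduler logic.
# task_id -> (upstream task_ids, trigger_rule), listed in topological order.
DAG_TASKS = {
    "instantiate_emr_cluster": ([], "all_success"),
    "initialize_tables": (["instantiate_emr_cluster"], "all_success"),
    "dataops_weekly_update_reviews": (["initialize_tables"], "all_success"),
    "load_dataops_reviews": (["dataops_weekly_update_reviews"], "all_success"),
    "load_dataops_surveys": (["dataops_weekly_update_reviews"], "all_success"),
    "load_cs_survey_answers": (["dataops_weekly_update_reviews"], "all_success"),
    "terminate_emr_cluster": (
        ["load_dataops_reviews", "load_dataops_surveys", "load_cs_survey_answers"],
        "all_done",
    ),
}


def simulate(tasks, failing):
    """Return the final state of each task, given the set of tasks that raise."""
    states = {}
    for task_id, (upstream, rule) in tasks.items():
        upstream_states = [states[u] for u in upstream]
        if rule == "all_success" and any(s != "success" for s in upstream_states):
            # An upstream task failed or was itself upstream_failed: do not run.
            states[task_id] = "upstream_failed"
        else:
            # Either every upstream task succeeded, or the rule is "all_done",
            # which only requires upstream tasks to be finished.
            states[task_id] = "failed" if task_id in failing else "success"
    return states


states = simulate(DAG_TASKS, failing={"dataops_weekly_update_reviews"})
for task_id, state in states.items():
    print(task_id, state)
```

Under this model the three `load_*` tasks end up `upstream_failed` and `terminate_emr_cluster` still runs, so every task reaches a terminal state.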

0 answers:

No answers yet