Airflow 1.10.0 BranchPythonOperator fails to run: Celery command failed

Time: 2018-10-12 07:27:10

Tags: airflow

I copied the code of the Airflow example DAG example_branch_dop_operator_v3 into my own DAG, test1_v2. example_branch_dop_operator_v3 runs successfully, but test1_v2 fails. The code for DAG test1_v2 (AIRFLOW_HOME/dags/test1.py):

import airflow
from airflow.operators.python_operator import BranchPythonOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.models import DAG

args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(2),
    'depends_on_past': True,
}

dag = DAG(dag_id='test1_v2',
          schedule_interval='*/1 * * * *', default_args=args)


def should_run(ds, **kwargs):

    print('------------- exec dttm = {} and minute = {}'.
          format(kwargs['execution_date'], kwargs['execution_date'].minute))
    if kwargs['execution_date'].minute % 2 == 0:
        return "oper_1"
    else:
        return "oper_2"


cond = BranchPythonOperator(
    task_id='condition',
    provide_context=True,
    python_callable=should_run,
    dag=dag)

oper_1 = DummyOperator(
    task_id='oper_1',
    dag=dag)
oper_1.set_upstream(cond)

oper_2 = DummyOperator(
    task_id='oper_2',
    dag=dag)
oper_2.set_upstream(cond)

Running the command airflow run test1_v2 condition "2018-09-01 00:00:00" produces the following worker log:

[2018-10-11 21:20:29,991] {cli.py:492} INFO - Running on host CenT
[2018-10-11 21:23:10,879] {settings.py:174} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
[2018-10-11 21:23:11,343] {__init__.py:51} INFO - Using executor CeleryExecutor
[2018-10-11 21:23:11,572] {cli.py:478} INFO - Loading pickle id 26
Traceback (most recent call last):
  File "/home/airflow/airflow/venv/bin/airflow", line 32, in <module>
    args.func(args)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/airflow/utils/cli.py", line 74, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/airflow/bin/cli.py", line 480, in run
    DagPickle).filter(DagPickle.id == args.pickle).first()
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2755, in first
    ret = list(self[0:1])
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2547, in __getitem__
    return list(res)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/loading.py", line 90, in instances
    util.raise_from_cause(err)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 187, in reraise
    raise value
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/loading.py", line 75, in instances
    rows = [proc(row) for row in fetch]
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/loading.py", line 75, in <listcomp>
    rows = [proc(row) for row in fetch]
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/loading.py", line 452, in _instance
    loaded_instance, populate_existing, populators)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/orm/loading.py", line 513, in _populate_full
    dict_[key] = getter(row)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/sqlalchemy/sql/sqltypes.py", line 1540, in process
    return loads(value)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/dill/_dill.py", line 316, in loads
    return load(file, ignore)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/dill/_dill.py", line 304, in load
    obj = pik.load()
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/dill/_dill.py", line , in find_class
    return StockUnpickler.find_class(self, module, name)
ImportError: No module named 'unusual_prefix_d47cb71ac291be245f60c8ac0070d906f4627fa1_test1'
[2018-10-11 21:23:11,823: ERROR/ForkPoolWorker-6] execute_command encountered a CalledProcessError
Traceback (most recent call last):
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/airflow/executors/celery_executor.py", line 60, in execute_command
    close_fds=True, env=env)
  File "/data/python35/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'airflow run test1_v1 condition 2018-09-01T10:00:00+08:00 --pickle 26 --local' returned non-zero exit status 1
[2018-10-11 21:23:11,895: ERROR/ForkPoolWorker-6] None
[2018-10-11 21:23:12,103: ERROR/ForkPoolWorker-6] Task airflow.executors.celery_executor.execute_command[efb4ef09-bdf8-4123-85c8-4dc73dc19d74] raised unexpected: AirflowException('Celery command failed',)
Traceback (most recent call last):
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/airflow/executors/celery_executor.py", line 60, in execute_command
    close_fds=True, env=env)
  File "/data/python35/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'airflow run test1_v1 condition 2018-09-01T10:00:00+08:00 --pickle 26 --local' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/celery/app/trace.py", line 375, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/celery/app/trace.py", line 632, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/airflow/airflow/venv/lib/python3.5/site-packages/airflow/executors/celery_executor.py", line 65, in execute_command
    raise AirflowException('Celery command failed')
airflow.exceptions.AirflowException: Celery command failed

Why does DAG test1_v2 fail? Thanks.

1 Answer:

Answer 0 (score: 1)

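When I replaced python_callable=should_run with python_callable=range, the DAG ran successfully. So I think the cause is that Airflow cannot find should_run when it loads the pickled DAG, as the log line ImportError: No module named 'unusual_prefix_d47cb71ac291be245f60c8ac0070d906f4627fa1_test1' shows.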
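The difference between range and should_run can be reproduced outside Airflow. The sketch below is only an illustration under stated assumptions: it uses the standard library's pickle rather than dill (which the traceback shows Airflow using), and the module name unusual_prefix_demo_test1 and the helper pickle_then_forget_module are hypothetical. It shows that a module-level function is pickled by reference (module name plus function name), so unpickling fails with an ImportError in any process where that module cannot be imported. A builtin such as range lives in a module that is always importable, which would explain why it works.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module name generated when a DAG file is loaded.
MODULE_NAME = "unusual_prefix_demo_test1"


def pickle_then_forget_module():
    """Pickle a module-level function, then unpickle it after the module is gone."""
    # Build a throwaway module that exists only in this process.
    module = types.ModuleType(MODULE_NAME)
    exec("def should_run():\n    return 'oper_1'\n", module.__dict__)
    sys.modules[MODULE_NAME] = module

    # pickle records only "module name + function name", not the function body.
    payload = pickle.dumps(module.should_run)

    # Simulate another process that never imported the DAG file.
    del sys.modules[MODULE_NAME]
    try:
        pickle.loads(payload)
    except ImportError as err:
        return payload, err
    return payload, None


if __name__ == "__main__":
    payload, err = pickle_then_forget_module()
    print("module name stored in pickle:", MODULE_NAME.encode() in payload)
    print("unpickling error:", err)

    # A builtin survives the round trip because its module is always importable.
    print("range round-trips:", pickle.loads(pickle.dumps(range)) is range)
```

If this is indeed what happens here, any approach that avoids shipping the pickled DAG to the worker sidesteps the failed lookup.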

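The workarounds are:

  • If you run the DAG from the command line, use airflow backfill test1_v2 -s 20180901 -e 20180902 -x (see the documentation). The -x flag (--donot_pickle) tells Airflow not to pickle the DAG object for the workers, so they run their own copy of the code.
  • The problem does not occur when the run is triggered by the Airflow scheduler.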