I am trying to clear a failed task so that it will run again.
I usually do this through the web GUI from the Tree View.
After selecting "Clear" I am taken to an error page:
The traceback on that page is the same error I receive when trying to clear this task using the CLI:
[u@airflow01 ~]# airflow clear -s 2002-07-29T20:25:00 -t coverage_check gom_modis_aqua_coverage_check
[2018-01-16 16:21:04,235] {__init__.py:57} INFO - Using executor CeleryExecutor
[2018-01-16 16:21:05,192] {models.py:167} INFO - Filling up the DagBag from /root/airflow/dags
Traceback (most recent call last):
  File "/usr/bin/airflow", line 28, in <module>
    args.func(args)
  File "/usr/lib/python3.4/site-packages/airflow/bin/cli.py", line 612, in clear
    include_upstream=args.upstream,
  File "/usr/lib/python3.4/site-packages/airflow/models.py", line 3173, in sub_dag
    dag = copy.deepcopy(self)
  File "/usr/lib64/python3.4/copy.py", line 166, in deepcopy
    y = copier(memo)
  File "/usr/lib/python3.4/site-packages/airflow/models.py", line 3159, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/usr/lib64/python3.4/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python3.4/copy.py", line 246, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python3.4/copy.py", line 166, in deepcopy
    y = copier(memo)
  File "/usr/lib/python3.4/site-packages/airflow/models.py", line 2202, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/usr/lib64/python3.4/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib64/python3.4/copy.py", line 246, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib64/python3.4/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib64/python3.4/copy.py", line 309, in _reconstruct
    y.__dict__.update(state)
AttributeError: 'NoneType' object has no attribute 'update'
I am looking for ideas on what might be causing this, what I should do to fix it, and how to avoid it in the future.
I was able to work around the issue by deleting the task record using the "Browse > Task Instances" search, but I would still like to get to the bottom of it, since I have seen this happen several times.
Although my DAG code has grown fairly complex, here is an excerpt of the operator definition from the DAG:
trigger_granule_dag_id = 'trigger_' + process_pass_dag_name
coverage_check = BranchPythonOperator(
    task_id='coverage_check',
    python_callable=_coverage_check,
    provide_context=True,
    retries=10,
    retry_delay=timedelta(hours=3),
    queue=QUEUE.PYCMR,
    op_kwargs={
        'roi': region,
        'success_branch_id': trigger_granule_dag_id
    }
)
The full source can be browsed at github/USF-IMARS/imars_dags. Here are links to the most relevant sections:
Answer 0 (score: 1)
Below is a sample DAG that I created to mimic the error you are facing.
import logging
import os
from datetime import datetime, timedelta
import boto3
from airflow import DAG
from airflow import configuration as conf
from airflow.operators import ShortCircuitOperator, PythonOperator, DummyOperator
def athena_data_validation(**kwargs):
    pass
start_date = datetime.now()
args = {
    'owner': 'airflow',
    'start_date': start_date,
    'depends_on_past': False,
    'wait_for_downstream': False,
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(seconds=30)
}
dag_name = 'data_validation_dag'
schedule_interval = None
dag = DAG(
    dag_id=dag_name,
    default_args=args,
    schedule_interval=schedule_interval)
athena_client = boto3.client('athena', region_name='us-west-2')
DAG_SCRIPTS_DIR = conf.get('core', 'DAGS_FOLDER') + "/data_validation/"
start_task = DummyOperator(task_id='Start_Task', dag=dag)
end_task = DummyOperator(task_id='End_Task', dag=dag)
data_validation_task = ShortCircuitOperator(
    task_id='Data_Validation',
    provide_context=True,
    python_callable=athena_data_validation,
    op_kwargs={
        'athena_client': athena_client,
        'sql_file': DAG_SCRIPTS_DIR + 'data_validation.sql',
        's3_output_path': 's3://XXX/YYY/'
    },
    dag=dag)
data_validation_task.set_upstream(start_task)
data_validation_task.set_downstream(end_task)
After one successful run, I tried to clear the Data_Validation task and got the same error.
I then removed the athena_client object creation from module level and placed it inside the athena_data_validation function, and it worked. So when we perform a clear from the Airflow UI, Airflow tries to deepcopy the DAG and pick up all of the objects from the previous run. I am still trying to understand why it cannot copy an object of that type, but I have a workaround that works for me.
Answer 1 (score: 1)
During certain operations, Airflow deep-copies some objects. Unfortunately, some objects do not allow this. The boto client is a great example of something that does not deep-copy well, and thread objects are another, but large objects with nested references (such as a reference back to a parent task) can also cause problems.
Generally, you do not want to instantiate a client in the DAG code itself. That said, I do not think that is your problem here, although I do not have access to the pyCMR code to check whether it is the culprit.