Is there any option in a DAG to customize the email that is sent on task failure? There is an option like 'email_on_failure': True, but it doesn't provide a way to add content dynamically to the email subject or body.
My DAG looks like this:
import airflow
from airflow import DAG
from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator
from airflow.operators.email_operator import EmailOperator
from airflow.operators.bash_operator import BashOperator
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.operators.sensors import HttpSensor
import json
from datetime import timedelta
from datetime import datetime
from airflow.models import Variable
args = {
    'owner': 'airflow',
    'email': ['test@gmail.com'],
    'email_on_failure': True,
    'email_on_retry': True,
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(0),
    'max_active_runs': 10
}

dag = DAG(dag_id='TEST_DAG', default_args=args, schedule_interval='@once')

new_cluster = {
    'spark_version': '4.0.x-scala2.11',
    'node_type_id': 'Standard_D16s_v3',
    'num_workers': 3,
    'spark_conf': {
        'spark.hadoop.javax.jdo.option.ConnectionDriverName': 'org.postgresql.Driver',
        .....
    },
    'custom_tags': {
        'ApplicationName': 'TEST',
        .....
    }
}

t1 = DatabricksSubmitRunOperator(
    task_id='t1',
    dag=dag,
    new_cluster=new_cluster,
    ......
)

t2 = SimpleHttpOperator(
    task_id='t2',
    method='POST',
    ........
)

t2.set_upstream(t1)

t3 = SimpleHttpOperator(
    task_id='t3',
    method='POST',
    .....
)

t3.set_upstream(t2)

send_mail = EmailOperator(
    dag=dag,
    task_id="send_mail",
    to=["test@gmail.com"],
    subject="Success",
    html_content='<h3>Success</h3>')

send_mail.set_upstream(t3)
In the success case, the send_mail task sends a customized email to the specified addresses.
But if a task fails, I also want to customize the email before it is sent to those addresses. That is not happening; on failure, the email goes out with the default subject and body.
Any help would be appreciated.
Answer 0 (score: 2)
I use on_failure_callback for this. Note that it will be triggered for every failed task in the DAG.
import logging

from airflow.operators.email_operator import EmailOperator

def report_failure(context):
    task_instance = context['task_instance']
    dag_id = context['dag'].dag_id
    # include this check if you only want to get one email per DAG run
    if task_instance.xcom_pull(task_ids=None, dag_id=dag_id, key=dag_id) is True:
        logging.info("Other failing task has been notified.")
        return
    # mark this DAG run as already notified so later failures stay silent
    task_instance.xcom_push(key=dag_id, value=True)
    send_email = EmailOperator(...)
    send_email.execute(context)

dag = DAG(
    ...,
    default_args={
        ...,
        "on_failure_callback": report_failure
    }
)
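The EmailOperator(...) above is elided in the original answer; as a rough sketch, its subject and body can be built dynamically from the callback's context dict (the task_id 'failure_email', recipient address, and message wording below are made-up placeholders):

def report_failure(context):
    task_instance = context['task_instance']
    # build a dynamic subject/body from the failed task's metadata
    send_email = EmailOperator(
        task_id='failure_email',
        to=['test@gmail.com'],
        subject='Airflow alert: task {} failed'.format(task_instance.task_id),
        html_content=(
            '<h3>Task {task} in DAG {dag} failed</h3>'
            '<p>Execution date: {date}</p>'
        ).format(
            task=task_instance.task_id,
            dag=context['dag'].dag_id,
            date=context['execution_date']))
    send_email.execute(context)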
Answer 1 (score: 2)
I managed this with the help of Airflow's TriggerRule (sample DAG below):
import airflow
from airflow import DAG
from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator
from airflow.operators.email_operator import EmailOperator
from airflow.operators.bash_operator import BashOperator
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.operators.sensors import HttpSensor
import json
from datetime import timedelta
from datetime import datetime
from airflow.models import Variable
from airflow.utils.trigger_rule import TriggerRule
args = {
    'owner': 'airflow',
    'email': ['test@gmail.com'],
    'email_on_failure': True,
    'email_on_retry': True,
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(0),
    'max_active_runs': 10
}

dag = DAG(dag_id='TEST_DAG', default_args=args, schedule_interval='@once')

new_cluster = {
    'spark_version': '4.0.x-scala2.11',
    'node_type_id': 'Standard_D16s_v3',
    'num_workers': 3,
    'spark_conf': {
        'spark.hadoop.javax.jdo.option.ConnectionDriverName': 'org.postgresql.Driver',
        .....
    },
    'custom_tags': {
        'ApplicationName': 'TEST',
        .....
    }
}

t1 = DatabricksSubmitRunOperator(
    task_id='t1',
    dag=dag,
    new_cluster=new_cluster,
    ......
)

t2 = SimpleHttpOperator(
    task_id='t2',
    trigger_rule=TriggerRule.ONE_SUCCESS,
    method='POST',
    ........
)

t2.set_upstream(t1)

t3 = SimpleHttpOperator(
    task_id='t3',
    trigger_rule=TriggerRule.ONE_SUCCESS,
    method='POST',
    .....
)

t3.set_upstream(t2)
AllTaskSuccess = EmailOperator(
    dag=dag,
    trigger_rule=TriggerRule.ALL_SUCCESS,
    task_id="AllTaskSuccess",
    to=["test@gmail.com"],
    subject="All Task completed successfully",
    html_content='<h3>All Task completed successfully</h3>')

AllTaskSuccess.set_upstream([t1, t2, t3])

t1Failed = EmailOperator(
    dag=dag,
    trigger_rule=TriggerRule.ONE_FAILED,
    task_id="t1Failed",
    to=["test@gmail.com"],
    subject="T1 Failed",
    html_content='<h3>T1 Failed</h3>')

t1Failed.set_upstream([t1])

t2Failed = EmailOperator(
    dag=dag,
    trigger_rule=TriggerRule.ONE_FAILED,
    task_id="t2Failed",
    to=["test@gmail.com"],
    subject="T2 Failed",
    html_content='<h3>T2 Failed</h3>')

t2Failed.set_upstream([t2])

t3Failed = EmailOperator(
    dag=dag,
    trigger_rule=TriggerRule.ONE_FAILED,
    task_id="t3Failed",
    to=["test@gmail.com"],
    subject="T3 Failed",
    html_content='<h3>T3 Failed</h3>')

t3Failed.set_upstream([t3])
Trigger Rules
Though the normal workflow behavior is to trigger tasks when all their directly upstream tasks have succeeded, Airflow allows for more complex dependency settings.
All operators have a trigger_rule argument which defines the rule by which the generated task gets triggered. The default value of trigger_rule is all_success and can be described as "trigger this task when all directly upstream tasks have succeeded". All other rules described here are based on direct parent tasks and are values that can be passed to any operator when creating a task (a short sketch combining two of these rules follows the list):
all_success: (default) all parents have succeeded
all_failed: all parents are in a failed or upstream_failed state
all_done: all parents are done with their execution
one_failed: fires as soon as at least one parent has failed; it does not wait for all parents to be done
one_success: fires as soon as at least one parent succeeds; it does not wait for all parents to be done
dummy: dependencies are just for show, trigger at will
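As a minimal, self-contained sketch of how these rules combine (the DAG id, task names, and bash commands here are invented for illustration, not part of the answer above): notify_failure fires as soon as either upstream task fails, while cleanup runs no matter how they finish.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.email_operator import EmailOperator
from airflow.utils.trigger_rule import TriggerRule

dag = DAG(dag_id='trigger_rule_demo',
          start_date=datetime(2019, 1, 1),
          schedule_interval='@once')

extract = BashOperator(task_id='extract', bash_command='echo extract', dag=dag)
load = BashOperator(task_id='load', bash_command='echo load', dag=dag)
load.set_upstream(extract)

# fires as soon as at least one parent fails (one_failed)
notify_failure = EmailOperator(
    task_id='notify_failure',
    trigger_rule=TriggerRule.ONE_FAILED,
    to=['test@gmail.com'],
    subject='Pipeline failed',
    html_content='<h3>An upstream task failed</h3>',
    dag=dag)
notify_failure.set_upstream([extract, load])

# runs once all parents are done, whether they succeeded or failed (all_done)
cleanup = BashOperator(
    task_id='cleanup',
    trigger_rule=TriggerRule.ALL_DONE,
    bash_command='echo cleanup',
    dag=dag)
cleanup.set_upstream([extract, load])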
Answer 2 (score: 0)
Currently on Airflow 1.10.1:
It looks like custom email options can be configured under the [email] section of airflow.cfg, using the following Jinja templates:
[email]
email_backend = airflow.utils.email.send_email_smtp
subject_template = /path/to/my_subject_template_file
html_content_template = /path/to/my_html_content_template_file
A custom message can be created by using task instance information in html_content_template, which in turn is a Jinja template. For example, the templates might look like the sketch below.
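As a sketch (assuming the variables Airflow 1.10 renders these templates with, such as ti, try_number, max_tries, and exception_html), my_subject_template_file could contain:

Airflow alert: {{ ti.task_id }} failed in DAG {{ ti.dag_id }}

and my_html_content_template_file could contain:

<h3>Task {{ ti.task_id }} in DAG {{ ti.dag_id }} failed</h3>
<p>Try {{ try_number }} of {{ max_tries }}.</p>
<p>Exception: {{ exception_html }}</p>
<p>Log: <a href="{{ ti.log_url }}">link</a></p>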
More details at https://airflow.apache.org/docs/stable/howto/email-config.html