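I apologize for the poor title.
I am trying to automate an email task. I need to send emails that are tied to a specific process, but I don't know how to iterate over the DataFrame correctly. I don't want to send an email for every task, only one per process.
Do I need a groupby? I'm really confused about how to do this.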
   Process ID  Task  Execution Date  Execution Time  Start Date  Start Time  End Date  End Time   Status               Emails  Process Status
0           A     1        8/7/2019         1:00 PM    8/7/2019     1:00 PM  8/7/2019   1:05 PM  Success                  NaN      Successful
1           A     2        8/7/2019         1:05 PM    8/7/2019     1:05 PM  8/7/2019   1:10 PM  Success                  NaN      Successful
2           A     3        8/7/2019         1:10 PM    8/7/2019     1:10 PM  8/7/2019   1:15 PM  Success  ['user1@gmail.com']      Successful
3           B     1        8/7/2019         2:00 PM    8/7/2019     2:00 PM  8/7/2019   2:05 PM  Success                  NaN          FAILED
4           B     2        8/7/2019         2:05 PM    8/7/2019     2:05 PM  8/7/2019   2:10 PM  Success  ['user2@gmail.com']          FAILED
5           B     3        8/7/2019         2:10 PM    8/7/2019     2:10 PM  8/7/2019   2:15 PM   FAILED                  NaN          FAILED
import smtplib

for process in df['Process ID'].unique():
    print(df['Execution Date'])
    # Pseudocode for the message I want to build (this is the part I can't work out):
    # msg = Process ID + ' was ' + Process Status + '. ' + Process ID
    #       + ' was completed on ' + End Date (of task 3) + ' at ' + End Time (of task 3)
    server = smtplib.SMTP('sever.com')
    server.sendmail(
        'my_email@gmail.com',
        df['Emails'],
        msg)
    server.quit()
The desired output is one email per process stating its status:
A was Successful. A was completed on 8/7/2019 at 1:15pm.
B was FAILED. B was completed on 8/7/2019 at 2:15pm.
Answer 0 (score: 0)
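As I read your question, what you really want is to reduce the messy task-level data to one row per Process ID, each carrying a boolean flag (every value in result_set is either True or False). You then want to take that aggregated data and act on it.
To get to that level of aggregation, do the following: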
df['process_failed'] = df['Process Status'] == 'FAILED' # consider case insensitive
result_set = df.groupby('Process ID')['process_failed'].any().rename('any_process_id_failed')
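I chose to create an intermediate column because GroupBy objects do not work well with string comparisons directly (short of resorting to some apply calls).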
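To make the aggregation concrete, here is a minimal sketch run on a cut-down copy of the sample table from the question. Only the columns needed for the flag are reproduced, and the `via_apply` variant is just an illustration of the apply-based alternative mentioned above, not something the original answer spells out.

```python
import pandas as pd

# Cut-down copy of the sample table (only the columns used for the flag).
df = pd.DataFrame({
    "Process ID": ["A", "A", "A", "B", "B", "B"],
    "Task": [1, 2, 3, 1, 2, 3],
    "Process Status": ["Successful"] * 3 + ["FAILED"] * 3,
})

# Intermediate boolean column, then one flag per process.
df["process_failed"] = df["Process Status"] == "FAILED"
result_set = (
    df.groupby("Process ID")["process_failed"]
    .any()
    .rename("any_process_id_failed")
)

# Alternative without the intermediate column: apply a function to each group.
via_apply = df.groupby("Process ID")["Process Status"].apply(
    lambda status: (status == "FAILED").any()
)

print(result_set)
```

Both variants should yield one boolean per Process ID; the intermediate column simply keeps the groupby call to a plain built-in aggregation.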
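From this output, loop over each entry and do whatever is needed to compose the email. pandas is not an email client (unless Zawinski's law really does hold in the long run), so that part is outside the scope of your question.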
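For the looping step, one possible sketch is shown here. It only builds the recipient list and message text per process and leaves the actual sending out. It assumes the Emails column holds real Python lists (or missing values) rather than strings, and that the highest-numbered task marks the completion time; `build_messages` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Cut-down sample; Emails is assumed to hold Python lists or missing values.
df = pd.DataFrame({
    "Process ID": ["A", "A", "A", "B", "B", "B"],
    "Task": [1, 2, 3, 1, 2, 3],
    "End Date": ["8/7/2019"] * 6,
    "End Time": ["1:05 PM", "1:10 PM", "1:15 PM", "2:05 PM", "2:10 PM", "2:15 PM"],
    "Emails": [None, None, ["user1@gmail.com"], None, ["user2@gmail.com"], None],
    "Process Status": ["Successful"] * 3 + ["FAILED"] * 3,
})


def build_messages(frame):
    """Return {process_id: (recipients, message)}, one entry per process."""
    messages = {}
    for process, tasks in frame.groupby("Process ID"):
        # Assumption: the last task (highest Task number) marks completion.
        last = tasks.sort_values("Task").iloc[-1]
        # Flatten every address list found on any task of this process.
        recipients = [addr for emails in tasks["Emails"].dropna() for addr in emails]
        text = (
            f"{process} was {last['Process Status']}. "
            f"{process} was completed on {last['End Date']} at {last['End Time']}."
        )
        messages[process] = (recipients, text)
    return messages


messages = build_messages(df)
for process, (recipients, text) in messages.items():
    print(recipients, text)
```

Each `(recipients, text)` pair could then be handed to `smtplib.SMTP.sendmail(from_addr, to_addrs, msg)` inside the same loop, with the server host and sender address taken from your own environment.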