Is it possible to avoid sending repeated Slack notifications for alerts that have already fired?

Time: 2019-07-11 15:12:30

Tags: kubernetes prometheus prometheus-alertmanager

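Disclaimer: this is my first time using Prometheus.

I am trying to send a Slack notification every time a Job finishes successfully.

To do this, I installed kube-state-metrics, Prometheus and AlertManager.

Then I created the following rule: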

rules:
  - alert: KubeJobCompleted
    annotations:    
      identifier: '{{ $labels.instance }}'
      summary: Job Completed Successfully
      description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
    expr: |
      kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"}  == 0
    labels:
      severity: information

And I added the following AlertManager receiver text (template):

{{ define "custom_slack_message" }}
{{ range .Alerts }}
    {{ .Annotations.description }}
{{ end }} 
{{ end }} 
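
For context, a template like this is usually referenced from a Slack receiver in the AlertManager configuration. The following is only a sketch of such wiring: the receiver name, channel, webhook URL, template path, and all timing values are assumptions for illustration and are not taken from the original setup.

```yaml
# alertmanager.yml (sketch; every name and value below is an assumption)
route:
  receiver: slack-jobs            # hypothetical receiver name
  group_by: ['alertname']         # all KubeJobCompleted alerts land in one group
  group_wait: 30s                 # wait before the first notification for a new group
  group_interval: 5m              # wait before notifying about changes to a group
  repeat_interval: 4h             # wait before re-sending an unchanged group

receivers:
  - name: slack-jobs
    slack_configs:
      - channel: '#jobs'          # hypothetical channel
        api_url: 'https://hooks.slack.com/services/REPLACE_ME'  # placeholder webhook
        text: '{{ template "custom_slack_message" . }}'

templates:
  - '/etc/alertmanager/templates/*.tmpl'   # assumed location of the template file
```

With grouping like this, each notification carries every alert that is currently in the group, so a template that ranges over `.Alerts` prints all of them, not only the one that was just added.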

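My current result: every time a new Job completes successfully, I receive a Slack notification listing all the Jobs that have completed successfully.

I didn't mind receiving the whole list at first, but now I would like each notification to contain only the Jobs that newly completed within the specified group interval.

Is that possible?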

2 Answers:

Answer 0 (score: 1)

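Just add one extra line to the rule so that only the most recently completed Jobs are listed.

The line `for: 10m` is intended to list only the Jobs completed in the last 10 minutes (in Prometheus, `for` keeps an alert pending until its expression has been continuously true for that long):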

rules:
  - alert: KubeJobCompleted
    annotations:    
      identifier: '{{ $labels.instance }}'
      summary: Job Completed Successfully
      description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
    expr: |
      kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"}  == 0
    for: 10m
    labels:
      severity: information

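Answer 1 (score: 0)

I ended up comparing kube_job_status_completion_time with time() to dismiss past events (this avoids firing the alert again for old Jobs at the repeat interval).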

rules:
  - alert: KubeJobCompleted
    annotations:    
      identifier: '{{ $labels.instance }}'
      summary: Job Completed Successfully
      description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
    expr: |
      time() - kube_job_status_completion_time < 60 and kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"}  == 0
    labels:
      severity: information
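
The same idea can be written out as a complete rule group with explicit parentheses, which makes the two conditions easier to read. This is a sketch only: the group name and the `interval` value are assumptions, and the `job="kube-state-metrics"` selector on the completion-time metric is added here for consistency with the other selectors.

```yaml
groups:
  - name: job-completion          # hypothetical group name
    interval: 30s                 # assumed; keep it shorter than the 60s window below
    rules:
      - alert: KubeJobCompleted
        expr: |
          # Condition 1: the Job finished less than 60 seconds ago.
          (time() - kube_job_status_completion_time{job="kube-state-metrics"} < 60)
          and
          # Condition 2: the Job reached its desired number of completions.
          (
            kube_job_spec_completions{job="kube-state-metrics"}
              - kube_job_status_succeeded{job="kube-state-metrics"} == 0
          )
        labels:
          severity: information
        annotations:
          identifier: '{{ $labels.instance }}'
          summary: Job Completed Successfully
          description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
```

Because the first condition stops matching 60 seconds after completion, each alert stops firing shortly afterwards and drops out of later notifications. If the rule is evaluated less often than every 60 seconds, a Job could complete and leave the window between two evaluations, so the window and the evaluation interval should be chosen together.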