Reducing the content in Prometheus alerts

Time: 2020-11-04 05:12:11

Tags: kubernetes prometheus prometheus-alertmanager prometheus-node-exporter

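I have kube-state-metrics, Prometheus, and Alertmanager deployed on a VM. I configured Prometheus and Alertmanager to fire an alert when the restart count increases by a certain amount within a given time window. Everything works. However, a lot of unnecessary data arrives as part of the alert. Basically, I don't want every label I see in Prometheus to become part of the alert.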

What I am currently receiving:

alertname = RestartsAlerts
container = kube-state-metrics
endpoint = http
exported_container = kube-scheduler
exported_namespace = kube-system
....
alertname = RestartsAlerts
container = kube-state-metrics
endpoint = http
exported_container = kube-scheduler
exported_namespace = kube-system

Alert configuration:

- name: Pod-Restarts
  rules:
    - alert: RestartsAlerts
      expr: max_over_time(kube_pod_container_status_restarts_total[3m]) - min_over_time(kube_pod_container_status_restarts_total[3m]) > 1
      labels:
        severity: critical
      annotations:
        summary: "More than 1 restart in pod {{ $labels.exported_pod }}"
        description: "{{ $labels.exported_container }} container has restarted {{ $value }} times.\n Instance: {{ $labels.instance }}"
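
One possible way to trim the labels, offered as a sketch rather than a confirmed fix: an alert carries every label of the series returned by `expr`, plus the ones under `labels:`, so aggregating the expression with `by (...)` keeps only the listed labels. The choice of labels to keep (`exported_namespace`, `exported_pod`, `exported_container`) is an assumption based on the output and templates above. Because `instance` is aggregated away in this variant, the description no longer references it.

```yaml
groups:
  - name: Pod-Restarts
    rules:
      - alert: RestartsAlerts
        # "max by (...)" drops every label not listed, so container, endpoint,
        # instance, etc. no longer appear on the alert.
        # The kept labels are an assumption; adjust the list as needed.
        expr: |
          max by (exported_namespace, exported_pod, exported_container) (
              max_over_time(kube_pod_container_status_restarts_total[3m])
            - min_over_time(kube_pod_container_status_restarts_total[3m])
          ) > 1
        labels:
          severity: critical
        annotations:
          summary: "More than 1 restart in pod {{ $labels.exported_pod }}"
          description: "{{ $labels.exported_container }} container has restarted {{ $value }} times."
```

If the instance is still wanted in the notification, adding `instance` to the `by (...)` list would keep it available to the templates.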

0 answers:

No answers