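I created alerting rules in Prometheus for pod memory usage. The alerts show up fine in my Slack channel, but they do not include the pod name, so it is hard to tell which pod has the problem.
The notification just shows [FIRING:35] (POD_MEMORY_HIGH_UTILIZATION default/k8s warning).
However, when I look at the Alerts section of the Prometheus UI, I can see the firing rules together with their pod names. Can anyone help?
My alert notification template is as follows: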
alertname: TargetDown
alertname: POD_CPU_HIGH_UTILIZATION
alertname: POD_MEMORY_HIGH_UTILIZATION
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#devops'
        title: '{{ .CommonAnnotations.summary }}'
        text: '{{ .CommonAnnotations.description }}'
        send_resolved: true
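I added the options title: '{{ .CommonAnnotations.summary }}' and text: '{{ .CommonAnnotations.description }}' to my alert notification template, and now the description is shown. My description is description: pod {{$labels.pod}} is using high memory, but the message only shows "is using high memory". The pod name is missing.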
Answer 0 (score: 0)
As mentioned in this article, you should check your alerting rules and update them if necessary. Here is an example:
ALERT ElasticacheCPUUtilisation
  IF aws_elasticache_cpuutilization_average > 80
  FOR 10m
  LABELS { severity = "warning" }
  ANNOTATIONS {
    summary = "ElastiCache CPU Utilisation Alert",
    description = "Elasticache CPU Usage has breach the threshold set (80%) on cluster id {{ $labels.cache_cluster_id }}, now at {{ $value }}%",
    runbook = "https://mywiki.com/ElasticacheCPUUtilisation",
  }
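The example above uses the old Prometheus 1.x rule syntax. For the pod memory alert from the question, a rough equivalent in the YAML rule format used since Prometheus 2.0 might look like the sketch below. The group name, the expression, and the threshold are assumptions for illustration only; keep whatever expression your rule already uses. The important part is that the expression preserves the pod label, because the {{ $labels.pod }} template renders as an empty string if the alert has no label with exactly that name.

```yaml
groups:
  - name: pod-memory                      # hypothetical group name
    rules:
      - alert: POD_MEMORY_HIGH_UTILIZATION
        # Hypothetical expression and threshold: aggregating "by (namespace, pod)"
        # keeps the pod label on the resulting alert.
        expr: sum by (namespace, pod) (container_memory_working_set_bytes) > 1e9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Pod memory usage is high"
          # Renders the pod name only if the alert really carries a "pod" label.
          description: "pod {{ $labels.pod }} is using high memory"
```

It is worth comparing the label names shown for the firing alert in the Prometheus UI with the one used in the template. Some older Kubernetes setups exposed this label as pod_name rather than pod, in which case the annotation would need {{ $labels.pod_name }} instead.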
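To give the Prometheus GUI an external URL (it is used for the graph links in notifications), pass the following CLI argument to the Prometheus server and restart it: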
-web.external-url=http://externally-available-url:9090/
After that, you can use these values in your Alertmanager configuration. Here is an example:
receivers:
  - name: 'iw-team-slack'
    slack_configs:
      - channel: alert-events
        send_resolved: true
        api_url: https://hooks.slack.com/services/<your_token>
        title: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] Monitoring Event Notification'
        text: >-
          {{ range .Alerts }}
            *Alert:* {{ .Annotations.summary }} - `{{ .Labels.severity }}`
            *Description:* {{ .Annotations.description }}
            *Graph:* <{{ .GeneratorURL }}|:chart_with_upwards_trend:> *Runbook:* <{{ .Annotations.runbook }}|:spiral_note_pad:>
            *Details:*
            {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
            {{ end }}
          {{ end }}
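Applied to the receiver from the question, the same idea could look roughly like the sketch below. The [FIRING:35] prefix means that 35 alerts were grouped into a single notification, and .CommonAnnotations only contains annotations whose values are identical across every alert in that group, so per-pod details are better printed by looping over .Alerts. The receiver name and channel are taken from the question; the label name pod is an assumption based on the {{$labels.pod}} template there.

```yaml
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#devops'
        send_resolved: true
        title: '[{{ .Status | toUpper }}] {{ .CommonLabels.alertname }}'
        # Literal block scalar (|-) keeps the line breaks, so each alert
        # in the group is printed on its own line.
        text: |-
          {{ range .Alerts }}
          *Pod:* `{{ .Labels.pod }}` - {{ .Annotations.description }}
          {{ end }}
```

If the pod name is still empty with this template, the alert most likely does not carry a pod label at all, and the rule expression or the label name is the place to look.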