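I'm trying to monitor my servers via ping (ICMP) using Prometheus, Grafana, and blackbox_exporter. We have recently had network instability, but my setup doesn't show any alerts. Should I be using 'probe_duration_seconds', or some other probe metric?
Grafana setup: the panel metric is probe_duration_seconds and the panel data source is Prometheus.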
blackbox.yml:
modules:
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      protocol: "icmp"
      preferred_ip_protocol: "ip4"
prometheus.yml:
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'blackbox'
    scrape_interval: 5s
    metrics_path: /probe
    params:
      module: [icmp]  # ping
    static_configs:
      - targets: ['192.168.1.29']
        labels:
          group: 'env A'
      - targets: ['192.168.2.185', '192.168.3.185', '192.168.4.185']
        labels:
          group: 'env B'
    relabel_configs:
      - source_labels: [__address__]
        regex: (.*)(:80)?
        target_label: __param_target
        replacement: ${1}
      - source_labels: [__param_target]
        regex: (.*)
        target_label: instance
        replacement: ${1}
      - source_labels: []
        regex: .*
        target_label: __address__
        replacement: 127.0.0.1:9115
Answer 0 (score: 1)
probe_success will be 1 or 0, depending on whether the ping succeeded.
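As a rough sketch of how probe_success could drive an alert, a Prometheus alerting rule along these lines might work. The group name, alert name, 1m hold time, and severity label are all assumptions chosen for illustration, not values from the question; the job label matches the 'blackbox' job defined in prometheus.yml above.

```yaml
groups:
  - name: blackbox-icmp            # hypothetical group name
    rules:
      - alert: PingFailed          # hypothetical alert name
        # probe_success is 0 whenever the ICMP probe fails or times out
        expr: probe_success{job="blackbox"} == 0
        for: 1m                    # assumed hold time before firing; tune to taste
        labels:
          severity: critical       # assumed label, used only for routing
        annotations:
          summary: "Ping to {{ $labels.instance }} is failing"
```

The rule file would need to be listed under rule_files in prometheus.yml before Prometheus evaluates it. For intermittent instability, where single probes fail but the target recovers quickly, graphing something like avg_over_time(probe_success[5m]) in Grafana may be more revealing than the raw 1/0 value.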