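I am running a Kubernetes cluster on AWS instances, with Prometheus running inside Kubernetes for monitoring. There are three etcd servers running outside of Kubernetes, and I am trying to monitor etcd's health with Prometheus.
Prometheus is deployed as a StatefulSet and already collects metrics from the kubelets, the node exporter, and itself. However, I cannot get any metrics from etcd.
Here is the relevant part of the Prometheus configuration: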
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 30s
      evaluation_interval: 30s
    rule_files:
      - /etc/alertmanager/*.rules
    scrape_configs:
      - job_name: etcd
        scheme: https
        static_configs:
          - targets: ['x.x.x.x:2379']
        tls_config:
          ca_file: /etc/etcd/ssl/ca.pem
          cert_file: /etc/etcd/ssl/client.pem
          key_file: /etc/etcd/ssl/client-key.pem
          insecure_skip_verify: true
      - job_name: kubelets
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
This is the error I get in the Prometheus dashboard:
Get https://x.x.x.x.:2379/metrics: x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs
The certificates are self-signed, but shouldn't insecure_skip_verify take care of that?
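Answer 0 (score: 0)
To rule out a problem on the etcd side, if you are using etcd3 you can pass the following flags to the etcd client, etcdctl, and follow the steps at https://github.com/coreos/etcd/blob/master/Documentation/dev-guide/interacting_v3.md to interact with the etcd server. If that works without errors, I would say this is a Prometheus problem, because the insecure_skip_verify: true setting is not being honored.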
--insecure-skip-tls-verify=true skip server certificate verification
--insecure-transport=true disable transport security for client connections
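
If etcdctl is not at hand, a similar check can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive test: the helper names build_context and fetch_metrics are hypothetical, and the certificate paths are assumed to be the ones from the question. It requests the /metrics endpoint with the client certificate while skipping server certificate verification, which is roughly what insecure_skip_verify: true is expected to do.

```python
import ssl
import urllib.request
from typing import Optional


def build_context(ca_file: Optional[str] = None,
                  cert_file: Optional[str] = None,
                  key_file: Optional[str] = None,
                  skip_verify: bool = False) -> ssl.SSLContext:
    """Build a client-side TLS context (hypothetical helper).

    With skip_verify=True the server certificate is not validated at all,
    so a missing IP SAN cannot cause a handshake failure.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    if cert_file:
        # Present a client certificate, as etcd may require one.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if skip_verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_metrics(url: str, ctx: ssl.SSLContext, timeout: float = 5.0) -> str:
    """Fetch the metrics page over HTTPS using the given TLS context."""
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, build the context with build_context("/etc/etcd/ssl/ca.pem", "/etc/etcd/ssl/client.pem", "/etc/etcd/ssl/client-key.pem", skip_verify=True) and pass it to fetch_metrics("https://x.x.x.x:2379/metrics", ctx). If that returns metrics while the same call with skip_verify=False fails with a certificate error, etcd itself is serving correctly and the problem lies in how the scraper validates the certificate.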