Prometheus cannot scrape external etcd

Time: 2017-10-04 18:10:50

Tags: amazon-web-services kubernetes monitoring prometheus etcd

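I am running a Kubernetes cluster on AWS instances, with Prometheus running inside Kubernetes for monitoring. Three etcd servers run outside of Kubernetes, and I am trying to monitor etcd health with Prometheus.

Prometheus is deployed as a StatefulSet and already collects metrics from the kubelets, the node exporter, and itself. However, I cannot get any metrics from etcd.

Here is the relevant part of the Prometheus configuration: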

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 30s
      evaluation_interval: 30s

    rule_files:
    - /etc/alertmanager/*.rules

    scrape_configs:

    - job_name: etcd
      scheme: https
      static_configs:
      - targets: ['x.x.x.x:2379']
      tls_config:
        ca_file: /etc/etcd/ssl/ca.pem
        cert_file: /etc/etcd/ssl/client.pem
        key_file: /etc/etcd/ssl/client-key.pem
        insecure_skip_verify: true

    - job_name: kubelets
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

This is the error I see in the Prometheus dashboard:

Get https://x.x.x.x.:2379/metrics: x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs

The certificates are self-signed, but shouldn't "insecure_skip_verify" take care of that?
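
The error text says that the certificate presented by etcd has no IP address entries in its subjectAltName extension, while the scrape target is addressed by IP. As a minimal sketch (not part of the original question), the Python below checks a certificate for IP SANs. It assumes the certificate is available as a dict shaped like the result of Python's ssl.SSLSocket.getpeercert() after a verified handshake; the function names and the sample values are hypothetical.

```python
import ipaddress


def ip_sans(cert):
    """Return the IP addresses listed in a certificate's subjectAltName.

    `cert` is assumed to be a dict shaped like the result of
    ssl.SSLSocket.getpeercert(): SAN entries are (type, value) pairs,
    and IP entries use the type string "IP Address".
    """
    return [
        ipaddress.ip_address(value)
        for kind, value in cert.get("subjectAltName", ())
        if kind == "IP Address"
    ]


def covers_ip(cert, target):
    """True if `target` (an IP string) appears among the certificate's IP SANs."""
    return ipaddress.ip_address(target) in ip_sans(cert)


# Hypothetical sample data: one certificate with only a DNS SAN,
# and one that also carries an IP SAN.
dns_only = {"subjectAltName": (("DNS", "etcd.example.internal"),)}
with_ip = {
    "subjectAltName": (
        ("DNS", "etcd.example.internal"),
        ("IP Address", "10.0.0.5"),
    )
}
```

A certificate like dns_only would fail IP-based verification for any target address, which matches the wording of the error above.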

1 Answer:

Answer 0: (score: 0)

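To rule out a problem on the etcd side: if you are using etcd3, you can pass the following flags to the etcd client, etcdctl, and follow the steps in https://github.com/coreos/etcd/blob/master/Documentation/dev-guide/interacting_v3.md to interact with the etcd server. If that works without errors, I would say this is a Prometheus problem, in that the insecure_skip_verify: true setting is not being honored.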

--insecure-skip-tls-verify=true   skip server certificate verification
--insecure-transport=true         disable transport security for client connections
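
Another way to rule etcd in or out, sketched here under assumptions rather than taken from the original answer, is to request the /metrics endpoint directly with certificate verification turned off, which is what insecure_skip_verify: true is meant to do. The helper names below are hypothetical and only the Python standard library is used.

```python
import ssl
import urllib.request


def insecure_context(cert_file=None, key_file=None):
    """Build a TLS context that skips server certificate verification."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    if cert_file and key_file:
        # Present a client certificate, in case etcd requires client auth.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def fetch_metrics(url, ctx, timeout=10):
    """Fetch the metrics page over HTTPS and return it as text."""
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, fetch_metrics("https://x.x.x.x:2379/metrics", insecure_context("/etc/etcd/ssl/client.pem", "/etc/etcd/ssl/client-key.pem")) uses the client certificate paths from the question, with x.x.x.x standing in for the real etcd address. If that call returns metrics while Prometheus still reports the x509 error, it would suggest the difference lies in how Prometheus applies its TLS settings rather than in etcd.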