Monitoring a Kubernetes cluster with Prometheus running outside the k8s cluster

Time: 2019-10-22 11:27:42

Tags: kubernetes monitoring prometheus

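  • We have a Kubernetes cluster in which I created a service account "kube", a namespace "monitoring", and a cluster role binding for monitoring the cluster.
  • Prometheus is installed on a Linux system outside the cluster (on-premises), and it was installed as "root".
  • When I try to connect to the k8s cluster over the HTTPS API using the ca.crt and user token (provided by the Kubernetes admin), it throws multiple errors.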

Error message:

component="discovery manager scrape" msg="Cannot create service discovery" err="unable to use specified CA cert /root/prometheus/ca.crt" type=*kubernetes.SDConfig

Prometheus configuration:


  - job_name: 'kubernetes-apiservers'
    scheme: https
    tls_config:
      ca_file: /root/prometheus/ca.crt
    bearer_token_file: /root/prometheus/user_token
    kubernetes_sd_configs:
    - role: endpoints
      api_server: https://example.com:1234
      bearer_token_file: /root/prometheus/user_token
      tls_config:
        ca_file: /root/prometheus/prometheus-2.12.0.linux-amd64/ca.crt
    relabel_configs:
    - source_labels: [monitoring, monitoring-sa, 6443]
      action: keep
      regex: default;kubernetes;https

  - job_name: 'kubernetes-nodes'
    scheme: https
    tls_config:
        ca_file: /root/prometheus/ca.crt
    bearer_token_file: /root/prometheus/user_token

    kubernetes_sd_configs:
    - role: node
      api_server: https://example.com:1234
      bearer_token_file: /root/prometheus/user_token
      tls_config:
        ca_file: /root/prometheus/ca.crt
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: https://example.com:1234
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics

3 answers:

Answer 0 (score: 1)

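The main problem you are facing is: "unable to use specified CA cert /root/prometheus/ca.crt"

Someone recently ran into the same problem: https://github.com/prometheus/prometheus/issues/6015#issuecomment-532058465

He solved it by reinstalling a newer version.

Version 2.13.1 has been released. Try installing the latest version; it may solve your problem as well.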

Answer 1 (score: 0)

Your ca.crt may be malformed. Check your CA cert file and make sure it has the following format:

-----BEGIN CERTIFICATE-----
xxxxx
-----END CERTIFICATE-----
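
To make that format check concrete, here is a minimal Python sketch that counts the well-formed PEM certificate blocks in a file's text. The function name `count_pem_certificates` is hypothetical, and the check is deliberately shallow: it only looks for the BEGIN/END markers and tests whether the payload between them is valid base64. It does not parse the certificate itself. A result of 0 for your ca.crt would be consistent with the "unable to use specified CA cert" error.

```python
import base64
import binascii
import re

# One PEM certificate block: header, payload, footer.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.+?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)


def count_pem_certificates(text: str) -> int:
    """Return how many well-formed PEM certificate blocks `text` contains.

    Hypothetical helper: checks the markers and the base64 payload only,
    not whether the payload is a real X.509 certificate.
    """
    count = 0
    for match in PEM_BLOCK.finditer(text):
        # Drop the line breaks that wrap the payload.
        body = "".join(match.group(1).split())
        if not body:
            continue  # markers with nothing between them
        try:
            base64.b64decode(body, validate=True)
        except (binascii.Error, ValueError):
            continue  # markers present, but the payload is not base64
        count += 1
    return count


# Usage (path taken from the question):
# with open("/root/prometheus/ca.crt") as f:
#     print(count_pem_certificates(f.read()))
```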

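I think your ca.crt was obtained with kubectl get serviceaccount -o yaml, but that is the public key of your Kubernetes cluster. If you want to get the token, you can set serviceAccountName in the YAML file of a new Deployment, like this: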

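# Throwaway Deployment whose only purpose is to mount the service account credentials
kind: Deployment
apiVersion: apps/v1  # extensions/v1beta1 was removed for Deployments in Kubernetes 1.16
metadata:
  name: test
spec:
  replicas: 1
  selector:  # required by apps/v1; must match the pod template labels
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
        version: v1
    spec:
      serviceAccountName: prometheus  # account whose token and ca.crt get mounted
      containers:
      - name: test
        image: alpine
        imagePullPolicy: Always
        command: ["ping", "127.0.0.1"]  # keeps the container running
      imagePullSecrets:
        - name: harbor-secret
      restartPolicy: Always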

Your token and ca.crt will then be available under /var/run/secrets/kubernetes.io/serviceaccount/ inside the pod.

Answer 2 (score: 0)

Your ca.crt is most likely still base64-encoded, because secret values are encoded that way when they are displayed.
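
If that is the case, decoding the file once should give a usable PEM file. The following Python sketch shows one way to do it, under the assumption that the file holds either PEM text or a single base64 layer around PEM text. The helper name `to_pem` is hypothetical.

```python
import base64
import binascii

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def to_pem(raw: bytes) -> bytes:
    """Return PEM bytes, base64-decoding `raw` once if it wraps a PEM file.

    Hypothetical helper: assumes at most one extra layer of base64.
    """
    data = raw.strip()
    if data.startswith(PEM_HEADER):
        return data + b"\n"  # already PEM, nothing to decode
    try:
        # Remove line wrapping before decoding strictly.
        decoded = base64.b64decode(b"".join(data.split()), validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("input is neither PEM nor base64") from exc
    decoded = decoded.strip()
    if not decoded.startswith(PEM_HEADER):
        raise ValueError("decoded data is not a PEM certificate")
    return decoded + b"\n"


# Usage (path taken from the question; writes a decoded copy next to it):
# with open("/root/prometheus/ca.crt", "rb") as f:
#     pem = to_pem(f.read())
# with open("/root/prometheus/ca.decoded.crt", "wb") as f:
#     f.write(pem)
```

Writing the result to a separate file keeps the original intact, so the ca_file entries in the Prometheus configuration can be pointed at the decoded copy and switched back if the assumption turns out to be wrong.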