Kubelet metrics not showing up in Prometheus

Time: 2021-07-01 10:05:28

Tags: amazon-web-services kubernetes prometheus grafana amazon-eks

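I have to set up a monitoring environment for my EKS cluster. Prometheus is running on an external node, and I am trying to collect metrics through the node-exporter DaemonSet. However, when I look at the targets page in Prometheus, the only target listed is localhost.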

kubernetes_sd_config block:

global:
  scrape_interval: 15s
scrape_configs:

- job_name: 'prometheus'
  scrape_interval: 15s
  static_configs:
    - targets: ['localhost:9100']


- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
  - role: endpoints
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  scheme: https
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https
  - target_label: __address__
    replacement: {{ kubernetes_api_server_addr }}


- job_name: 'kubernetes-kube-state'
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/token
  kubernetes_sd_configs:
  - role: pod
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  scheme: https
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
  - source_labels: [__meta_kubernetes_pod_label_grafanak8sapp]
    regex: .*true.*
    action: keep
  - target_label: __address__
    replacement: {{ kubernetes_api_server_addr }}
  - source_labels: ['__meta_kubernetes_pod_label_daemon', '__meta_kubernetes_pod_node_name']
    regex: 'node-exporter;(.*)'
    action: replace
    target_label: nodename
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name]
    regex: (.+);(.+)
    target_label: __metrics_path__
    replacement: /api/v1/namespaces/${1}/pods/${2}/proxy/metrics

###################################################################################
# Scrape config for nodes (kubelet).                                              #
#                                                                                 #
# Rather than connecting directly to the node, the scrape is proxied through the  #
# Kubernetes apiserver.  This means it will work if Prometheus is running out of  #
# cluster, or can't connect to nodes for some other reason (e.g. because of       #
# firewalling).                                                                   #
###################################################################################

- job_name: 'kubernetes-kubelet'
  scheme: https
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/token

  kubernetes_sd_configs:
  - role: node
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: {{ kubernetes_api_server_addr }}
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics

- job_name: 'kubernetes-cadvisor'
  scheme: https
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/token
  kubernetes_sd_configs:
  - role: node
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: {{ kubernetes_api_server_addr }}
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor


###################################################################################
# Example scrape config for service endpoints.                                    #
#                                                                                 #
# The relabeling allows the actual service scrape endpoint to be configured       #
# for all or only some endpoints.                                                 #
###################################################################################

- job_name: 'kubernetes-service-endpoints'

  kubernetes_sd_configs:
  - role: endpoints
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token

  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

#########################################################################################
# Example scrape config for probing services via the Blackbox Exporter.                 #
#                                                                                       #
# The relabeling allows the actual service scrape endpoint to be configured             #
# for all or only some services.                                                        #
#########################################################################################

- job_name: 'kubernetes-services'
  kubernetes_sd_configs:
  - role: service
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  scheme: https
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/token
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - target_label: __address__
    replacement: {{ kubernetes_api_server_addr }}
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name]
    regex: (.+);(.+)
    target_label: __metrics_path__
    replacement: /api/v1/namespaces/$1/services/$2/proxy/metrics

##################################################################################
# Example scrape config for pods                                                 #
#                                                                                #
# The relabeling allows the actual pod scrape to be configured                   #
# for all the declared ports (or port-free target if none is declared)           #
# or only some ports.                                                            #
##################################################################################

- job_name: 'kubernetes-pods'

  kubernetes_sd_configs:
  - role: pod
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  relabel_configs:
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_example_io_scrape_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pods

- job_name: 'kubernetes-service-endpoints-e'
  kubernetes_sd_configs:
  - role: endpoints
    api_server: https://{{ kubernetes_api_server_addr }}
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /etc/prometheus/token
  scheme: https
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (\d+)
    target_label: __meta_kubernetes_pod_container_port_number
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    regex: ()
    target_label: __meta_kubernetes_service_annotation_prometheus_io_path
    replacement: /metrics
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_pod_container_port_number, __meta_kubernetes_service_annotation_prometheus_io_path]
    target_label: __metrics_path__
    regex: (.+);(.+);(.+);(.+)
    replacement: /api/v1/namespaces/$1/services/$2:$3/proxy$4
  - target_label: __address__
    replacement: {{ kubernetes_api_server_addr }}
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: instance

This is the prometheus.yml file on my Prometheus instance.

Prometheus instance logs from /var/log/messages:

Jul  1 15:18:53 ip-XXXXXXXXXXX prometheus: ts=2021-07-01T15:18:53.655Z caller=log.go:124 component=k8s_client_runtime level=debug func=Verbose.Infof msg="Listing and watching *v1.Endpoints from pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167"
Jul  1 15:18:53 ip-XXXXXXXXXXX prometheus: ts=2021-07-01T15:18:53.676Z caller=log.go:124 component=k8s_client_runtime level=debug func=Infof msg="GET https://XXXXXXXXXXXXXXXXXXXXXXX.eks.amazonaws.com/api/v1/endpoints?limit=500&resourceVersion=0  in 20 milliseconds"
Jul  1 15:18:53 ip-XXXXXXXXXXX prometheus: ts=2021-07-01T15:18:53.676Z caller=log.go:124 component=k8s_client_runtime level=error func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get \"https://XXXXXXXXXXXXXXXXXXXXXXX.eks.amazonaws.com/api/v1/endpoints?limit=500&resourceVersion=0\": x509: certificate signed by unknown authority"
Jul  1 15:18:56 ip-XXXXXXXXXXX prometheus: ts=2021-07-01T15:18:56.445Z caller=log.go:124 component=k8s_client_runtime level=debug func=Verbose.Infof msg="Listing and watching *v1.Pod from pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167"
Jul  1 15:18:56 ip-XXXXXXXXXXX prometheus: ts=2021-07-01T15:18:56.445Z caller=log.go:124 component=k8s_client_runtime level=debug func=Infof msg="GET https://XXXXXXXXXXXXXXXXXXXXXXX.eks.amazonaws.com/api/v1/pods?limit=500&resourceVersion=0  in 0 milliseconds"
Jul  1 15:18:56 ip-XXXXXXXXXXX prometheus: ts=2021-07-01T15:18:56.445Z caller=log.go:124 component=k8s_client_runtime level=error func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: Get \"https://XXXXXXXXXXXXXXXXXXXXXXX.eks.amazonaws.com/api/v1/pods?limit=500&resourceVersion=0\": unable to read authorization credentials file /etc/prometheus/token: open /etc/prometheus/token: no such file or directory"

1 answer:

Answer 0: (score: 1)

The logs you shared point to the problem:

... unable to read authorization credentials file /etc/prometheus/token: open /etc/prometheus/token: no such file or directory"

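Prometheus cannot find the file named in bearer_token_file, so its requests to the Kubernetes API fail and service discovery produces no targets. For in-cluster workloads the token file is mounted by default at /var/run/secrets/kubernetes.io/serviceaccount/token, but since you mention that Prometheus runs "on an external node" (I am not sure what you mean by that), this may or may not be useful to you. You can change the path in your configuration.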
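To confirm this before restarting Prometheus, a small pre-flight check can list every bearer_token_file path in the configuration and report the ones that cannot be read. The following is only a minimal sketch: `token_files` and `unreadable` are hypothetical helper names, not part of Prometheus, and the regex scan is an assumption made because the templated `{{ ... }}` placeholders keep the file from being parsed as plain YAML.

```python
import os
import re

# Matches lines such as "  bearer_token_file: /etc/prometheus/token".
TOKEN_LINE = re.compile(
    r"^[ \t]*bearer_token_file:[ \t]*(\S+)[ \t]*$", re.MULTILINE
)


def token_files(config_text):
    """Return the distinct bearer_token_file paths, in order of first appearance."""
    seen = []
    for raw in TOKEN_LINE.findall(config_text):
        path = raw.strip("'\"")  # tolerate quoted values
        if path not in seen:
            seen.append(path)
    return seen


def unreadable(paths):
    """Return the paths that are missing or not readable by the current user."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]


if __name__ == "__main__":
    # Inline sample standing in for the real prometheus.yml.
    sample = """
- job_name: 'kubernetes-kubelet'
  scheme: https
  bearer_token_file: /etc/prometheus/token
"""
    for path in unreadable(token_files(sample)):
        print(f"cannot read token file: {path}")
```

Run it as the same user that runs Prometheus, since read permission is checked for the current user. If the path is reported, either place a valid service account token at that location or point bearer_token_file at the file that actually holds it.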