Configure Prometheus to collect custom metrics from Dockerized Node.js pods

Time: 2020-06-18 14:43:02

Tags: docker kubernetes prometheus kubernetes-helm

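I have set up prom-client (an unofficial Prometheus client library) to collect the custom metrics I need. I deployed the Prometheus server with Helm, following the EKS setup guide. Now I am trying to edit the default ConfigMap so that Prometheus scrapes my application's metrics, but I get this error: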

parsing YAML file /etc/config/prometheus.yml: yaml: unmarshal errors:\n line 22: field cluster_ip not found in type kubernetes.plain\n line 25: cannot unmarshal !!str `default` into []string

This is the prometheus.yaml ConfigMap I wrote based on the documentation:

apiVersion: v1
data:
  alerting_rules.yml: |
    {}
  alerts: |
    {}
  prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    rule_files:
    - /etc/config/recording_rules.yml
    - /etc/config/alerting_rules.yml
    - /etc/config/rules
    - /etc/config/alerts
    scrape_configs:
    ...DEFAULT CONFIGS...
    - job_name: my_metrics
      scrape_interval: 5m
      scrape_timeout: 10s
      honor_labels: true
      metrics_path: /api/metrics
      kubernetes_sd_configs:
        - role: service
          cluster_ip: 10.100.200.92
          namespaces:
            names:
              default
  recording_rules.yml: |
    {}
  rules: |
    {}
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-08T09:26:38Z"
  labels:
    app: prometheus
    chart: prometheus-11.3.0
    component: server
    heritage: Helm
    release: prometheus
  name: prometheus-server
  namespace: prometheus
  uid: 8fadb17a-f5c5-4f9d-a931-fa1f77684847

10.100.200.92 is the IP assigned to my Service, which exposes the deployment.

My deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
          name: myapp
          resources:
              limits:
                cpu: "1000m"
                memory: "2400Mi"
              requests:
                cpu: "500m"
                memory: "2000Mi"
          imagePullPolicy: IfNotPresent
          ports:
              - containerPort: 5000
                name: myapp

My service.yaml file, which exposes the deployment:

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    deploy: staging
    name: myapp
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 5000
      protocol: TCP

Is there a different or more effective way to target my app for metrics collection? Please let me know. Thanks.

1 Answer:

Answer 0: (score: 2)

This is what I use to enable Prometheus scraping inside the cluster.

In the scrape configuration I have the following snippet:

      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - action: labeldrop
            regex: '(kubernetes_pod|app_kubernetes_io_instance|app_kubernetes_io_name|instance)'

This is taken directly from the default values of the Prometheus Helm chart: https://github.com/helm/charts/blob/master/stable/prometheus/values.yaml#L1452
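The `__address__` rewrite is probably the least obvious rule in that job, so here is the same rule again with comments tracing one target through it. The pod IP and the port numbers are made up for illustration; the rule itself is unchanged from the snippet.

```yaml
# Same rule as in the job above; comments trace one hypothetical target.
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  # Source label values are joined with ";" (the default separator), e.g.
  #   __address__         = "10.0.1.7:8080"   (hypothetical pod IP and port)
  #   prometheus.io/port  = "5000"            (hypothetical annotation value)
  #   joined              = "10.0.1.7:8080;5000"
  regex: ([^:]+)(?::\d+)?;(\d+)
  # $1 = "10.0.1.7" (host without its port), $2 = "5000" (annotated port)
  replacement: $1:$2
  # Result: __address__ = "10.0.1.7:5000"
  target_label: __address__
```

If a pod has no prometheus.io/port annotation, the joined value ends in ";" with no digits after it, the regex does not match, and `__address__` is left as discovered.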

This job instructs Prometheus to scrape every pod that has the annotation prometheus.io/scrape: "true" set. Two further annotations on the pod let you configure the port and path to scrape:

prometheus.io/path: "/metrics"
prometheus.io/port: "9090"
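For the app in the question, the annotation values would presumably look like the sketch below. It assumes the metrics are served on the container port from the question's deployment.yaml (5000) at the path from its scrape job (/api/metrics); neither is confirmed, so adjust if the app differs.

```yaml
# Assumed values for the app in the question, not confirmed by the asker.
prometheus.io/scrape: "true"
prometheus.io/port: "5000"           # containerPort in the question's deployment.yaml
prometheus.io/path: "/api/metrics"   # metrics_path in the question's scrape job
```

The values are quoted because Kubernetes annotation values must be strings; an unquoted 5000 or true would be rejected.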

So you also need to modify your deployment.yaml to specify these annotations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "<enter port of pod to scrape>"
        prometheus.io/path: "<enter path to scrape>"
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
...
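
As for the parse error itself: kubernetes_sd_configs has no cluster_ip field, and namespaces.names must be a YAML list rather than a bare string, which is what the two lines of the error message say. If you would rather keep the service-based job from the question, a rough sketch of a version that should parse is below. The keep rule on the Service name is my own addition, assuming the Service is called myapp and lives in the default namespace as in the question.

```yaml
# Sketch only: the question's job with the two parse errors removed.
- job_name: my_metrics
  scrape_interval: 5m
  scrape_timeout: 10s
  honor_labels: true
  metrics_path: /api/metrics
  kubernetes_sd_configs:
    - role: service            # no cluster_ip here; targets come from discovery
      namespaces:
        names:
          - default            # a list item, not a bare string
  relabel_configs:
    # Assumption: keep only the Service named myapp (name taken from service.yaml).
    - source_labels: [__meta_kubernetes_service_name]
      action: keep
      regex: myapp
```

With role: service, each scrape goes through the Service and lands on whichever of the two replicas answers, so per-pod metrics get mixed together. That is the main reason I would go with the pod-based job above instead.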