GKE K8s HPA unable to fetch Stackdriver metrics

Time: 2020-01-09 12:52:03

Tags: kubernetes google-cloud-platform google-kubernetes-engine

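We have a GKE Kubernetes cluster, and we want to scale pods based on a custom metric that our application logic exports to Stackdriver.

I am able to push the metric and can see it in Metrics Explorer.

We can also see the metric in the Kubernetes custom metrics list:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | python -m json.tool | grep -a10 num_drivers_per_pod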

{
            "kind": "MetricValueList",
            "name": "*/custom.googleapis.com|num_drivers_per_pod",
            "namespaced": true,
            "singularName": "",
            "verbs": [
                "get"
            ]
        }
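
The same check can be scripted instead of grepped. This is a minimal sketch, assuming the discovery response has been captured as a JSON string with a top-level `resources` list; the function name `find_metric_resources` and the sample payload are hypothetical and only mirror the entry shown above.

```python
import json


def find_metric_resources(discovery_json, metric_suffix):
    """Return names of API resources whose name ends with metric_suffix."""
    doc = json.loads(discovery_json)
    return [
        res["name"]
        for res in doc.get("resources", [])
        if res.get("name", "").endswith(metric_suffix)
    ]


# Hypothetical sample shaped like the entry shown above.
sample = json.dumps({
    "kind": "APIResourceList",
    "resources": [
        {
            "kind": "MetricValueList",
            "name": "*/custom.googleapis.com|num_drivers_per_pod",
            "namespaced": True,
            "verbs": ["get"],
        }
    ],
})

print(find_metric_resources(sample, "num_drivers_per_pod"))
```

Note that the listed name carries the `custom.googleapis.com|` prefix, which is why the match is done on the suffix.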

We have successfully installed the Stackdriver adapter, and it is running alongside Heapster.

But when we deploy the following HPA manifest:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-sd-num-drivers
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test-ws-api-server
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metricName: "num_drivers_per_pod"
        targetAverageValue: 2

the cluster fails to fetch the metric and reports the following:

Name:               custom-metric-sd-num-drivers
Namespace:          default
Labels:             <none>
Annotations:        autoscaling.alpha.kubernetes.io/conditions:
                      [{"type":"AbleToScale","status":"True","lastTransitionTime":"2020-01-07T14:26:25Z","reason":"SucceededGetScale","message":"the HPA control...
                    autoscaling.alpha.kubernetes.io/current-metrics:
                      [{"type":"External","external":{"metricName":"custom.googleapis.com|num_drivers_per_pod","currentValue":"0","currentAverageValue":"1"}}]
                    autoscaling.alpha.kubernetes.io/metrics: [{"type":"Pods","pods":{"metricName":"num_drivers_per_pod","targetAverageValue":"2"}}]
                    kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"custom-metric-sd-num-drivers","n...
CreationTimestamp:  Tue, 07 Jan 2020 19:56:10 +0530
Reference:          Deployment/test-ws-api-server
Min replicas:       1
Max replicas:       5
Deployment pods:    1 current / 1 desired
Events:
  Type     Reason               Age                   From                       Message
  ----     ------               ----                  ----                       -------
  Warning  FailedGetPodsMetric  47s (x6237 over 27h)  horizontal-pod-autoscaler  unable to get metric num_drivers_per_pod: no metrics returned from custom metrics API
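
For context on what the HPA would do once the metric is readable: with a `Pods` metric and `targetAverageValue: 2`, the replica count is derived from the ratio of the current per-pod average to the target. The sketch below is a simplified version of that documented calculation; it ignores the controller's tolerance band and stabilization behavior, and the function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_average, target_average,
                     min_replicas, max_replicas):
    """Approximate the HPA replica calculation for an average-value target."""
    raw = math.ceil(current_replicas * current_average / target_average)
    # Clamp to the minReplicas / maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, raw))


# One pod reporting an average of 5 drivers against a target of 2.
print(desired_replicas(1, 5.0, 2.0, 1, 5))   # 3
# Three pods averaging 10 drivers would exceed maxReplicas and be capped.
print(desired_replicas(3, 10.0, 2.0, 1, 5))  # 5
```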

Below is the code used to push the metric:

import logging
import os
import time
import traceback

from google.cloud import monitoring_v3

# Assumed: a module-level logger (not shown in the original snippet).
logger = logging.getLogger(__name__)


def put_k8_pod_metric(metric_name, value, metric_type="k8s_pod"):
    """Write one data point for a custom metric attached to this pod."""
    try:
        client = monitoring_v3.MetricServiceClient()
        series = monitoring_v3.types.TimeSeries()
        series.metric.type = f'custom.googleapis.com/{metric_name}'
        # Monitored resource that the data point is attached to.
        series.resource.type = metric_type
        series.resource.labels['project_id'] = os.getenv("PROJECT_NAME")
        series.resource.labels['location'] = os.getenv("POD_LOCATION", "asia-south1")
        series.resource.labels['cluster_name'] = os.getenv("CLUSTER_NAME", "data-k8cluster")
        series.resource.labels['namespace_name'] = "default"
        series.resource.labels['pod_name'] = os.getenv("MY_POD_NAME", "wrong_pod")
        point = series.points.add()
        point.value.double_value = value
        # Split the current time into whole seconds and leftover nanoseconds.
        now = time.time()
        point.interval.end_time.seconds = int(now)
        point.interval.end_time.nanos = int(
            (now - point.interval.end_time.seconds) * 10**9)
        project_name = client.project_path(os.getenv('PROJECT_NAME'))
        client.create_time_series(project_name, [series], timeout=2)
        logger.info(f"successfully send the metric {metric_name} with value {value}")
    except Exception as e:
        traceback.print_exc()
        logger.info(f"failed to send the metric {metric_name} with value {value}")

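Can you point out where to look and what might be causing the problem?

Hey, I just solved this by upgrading the Deployment apiVersion and going back to the gke_container resource type. I have published a simple Python repo that implements the same thing: gke-hpa-custom-metric-python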
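
The comment does not show the changed code. As a rough sketch only, the resource labels for the legacy `gke_container` type could be built as below. The label names reflect my understanding of that monitored resource type and should be checked against the Cloud Monitoring monitored resource list; the function name and all argument values are hypothetical.

```python
def gke_container_labels(project_id, zone, cluster_name, namespace,
                         pod_uid, instance_id=""):
    """Build resource labels for the legacy gke_container resource type.

    Assumption: label names follow the legacy resource model, where the
    pod is identified by its UID (pod_id) rather than by its name.
    """
    return {
        "project_id": project_id,
        "zone": zone,
        "cluster_name": cluster_name,
        "namespace_id": namespace,
        "instance_id": instance_id,
        "pod_id": pod_uid,
        # Left empty so the point is attached to the pod as a whole.
        "container_name": "",
    }


# Hypothetical values for illustration only.
labels = gke_container_labels(
    project_id="my-project",
    zone="asia-south1-a",
    cluster_name="data-k8cluster",
    namespace="default",
    pod_uid="00000000-0000-0000-0000-000000000000",
)
print(sorted(labels))
```

In the function from the question, these would replace the `k8s_pod` labels, with `series.resource.type` set to `gke_container`.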

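1 answer:

Answer 0 (score: 0)

I found this post, which describes a problem similar to yours.

There may be some confusion between "external.metrics" and "custom.metrics".

Here the type is set to "External", but the name indicates "custom":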

[{"type":"External","external":{"metricName":"custom.googleapis.com

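You should check the "type:" value in the HorizontalPodAutoscaler.

For custom metrics, "type: Object" should be specified, as mentioned here.

Edit.

As I understand it, there are four pieces involved here:
- MetricValueList
- Stackdriver Metrics Explorer
- HorizontalPodAutoscaler
- Python script

Since you can see the metric in Metrics Explorer, problems with the MetricValueList and the Python script can be ruled out.

Knowing this, the problem is most likely in or around the HorizontalPodAutoscaler.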
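
The mismatch can be spotted programmatically by comparing the two annotations from the `describe` output. This is a small sketch using the annotation values copied from the question; the helper name `metric_types` is hypothetical.

```python
import json


def metric_types(annotation_value):
    """Return the sorted set of 'type' values in an HPA metrics annotation."""
    return sorted({entry["type"] for entry in json.loads(annotation_value)})


# Values copied from the describe output in the question.
configured = (
    '[{"type":"Pods","pods":{"metricName":"num_drivers_per_pod",'
    '"targetAverageValue":"2"}}]'
)
current = (
    '[{"type":"External","external":{"metricName":'
    '"custom.googleapis.com|num_drivers_per_pod",'
    '"currentValue":"0","currentAverageValue":"1"}}]'
)

if metric_types(configured) != metric_types(current):
    print("type mismatch:", metric_types(configured), "vs", metric_types(current))
```

A difference like this suggests the recorded current metrics do not correspond to the metric source the manifest asks for.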

The fact that this command returns no items

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/num_drivers_per_pod"