Google Cloud GKE horizontal Pod autoscaling based on Kubernetes metrics

Date: 2020-11-06 18:16:41

Tags: kubernetes google-cloud-platform google-kubernetes-engine horizontal-pod-autoscaling


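I want to use the standard Kubernetes metric for pod network received bytes count in an HPA. I am using the YAML below, but I get this error: unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered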

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: xxxxx
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: xxxx-xxx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: received_bytes_count
      targetAverageValue: 20k

If anyone has experience with this kind of metric, any help would be much appreciated.


2 answers:

Answer 0 (score: 1)

autoscaling/v1 is an API that autoscales based on CPU utilization only. To autoscale on other metrics, you should use autoscaling/v2beta2. I suggest reading this doc to check the API versions.

Answer 1 (score: 1)

Solution

To make this work, you need to deploy the Stackdriver Custom Metrics Adapter. The commands below deploy it.

$ kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

$ kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml

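Afterwards, you need to use the correct custom metric name, which in your case should be kubernetes.io|pod|network|received_bytes_count.

Explanation

The Custom and external metrics for autoscaling workloads documentation describes how to deploy the Stackdriver adapter, which is required before custom metrics are available:

Before you can use custom metrics, you must enable Monitoring in your Google Cloud project and install the Stackdriver adapter on your cluster.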
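
One way to confirm that the adapter is in place before creating the HPA is to check whether the custom metrics API is registered, since that is exactly what the error in the question complains about. The sketch below assumes the adapter registers an APIService named v1beta1.custom.metrics.k8s.io (the usual <version>.<group> naming); the function name is made up for illustration.

```shell
# Sketch: report whether the custom metrics API is registered in the cluster.
# Assumption: the APIService is named v1beta1.custom.metrics.k8s.io.
check_custom_metrics_api() {
  if kubectl get apiservice v1beta1.custom.metrics.k8s.io >/dev/null 2>&1; then
    echo "custom.metrics.k8s.io is registered"
  else
    echo "custom.metrics.k8s.io is not registered; deploy the adapter first" >&2
    return 1
  fi
}
```

Calling check_custom_metrics_api should succeed once the adapter is deployed. After that, kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" can be used to list what the API exposes.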

The next step is to deploy your application (for testing I used an Nginx deployment) and create a suitable HPA.

Your HPA example has a few issues:

apiVersion: autoscaling/v2beta1 # you can also use autoscaling/v2beta2 if you need more features; v2beta1 is fine for this scenario
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: xxxxx # the HPA specifies a namespace, but the Deployment does not
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1 # apps/v1beta1 is quite old; in Kubernetes 1.16+ it was replaced by apps/v1
    kind: Deployment
    name: xxxx-xxx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: received_bytes_count # this metric name should be replaced with kubernetes.io|pod|network|received_bytes_count
      targetAverageValue: 20k

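On GKE you can choose between autoscaling/v2beta1 and autoscaling/v2beta2. Your case works with both apiVersions, but if you decide to use autoscaling/v2beta2 you need to change the manifest syntax.

Why kubernetes.io/pod/network/received_bytes_count? You are referring to a Kubernetes metric, and pod/network/received_bytes_count is listed in this docs.

Why | instead of /? The Stackdriver documentation on GitHub explains it:

Stackdriver metrics have a form of paths separated by the "/" character, but the Custom Metrics API forbids the use of the "/" character. When using Custom Metrics - Stackdriver Adapter, either directly via the Custom Metrics API or by specifying a custom metric in an HPA, replace the "/" character with "|". For example, to use custom.googleapis.com/my/custom/metric, specify custom.googleapis.com|my|custom|metric.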
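
The substitution rule quoted above is mechanical, so it can be sketched as a small shell helper. The function name to_hpa_metric_name is hypothetical and only serves this illustration.

```shell
# Convert a Stackdriver metric path into the form the Custom Metrics API
# accepts by replacing every "/" with "|".
to_hpa_metric_name() {
  printf '%s\n' "$1" | tr '/' '|'
}

to_hpa_metric_name "kubernetes.io/pod/network/received_bytes_count"
# prints: kubernetes.io|pod|network|received_bytes_count
```

The resulting string is what goes into metricName (v2beta1) or metric.name (v2beta2). Quoting it in the manifest is a safe habit.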

Correct configuration

For v2beta1

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
spec:
  scaleTargetRef:
    apiVersion: apps/v1 # yours uses apps/v1beta1; my test Deployment was created with apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metricName: "kubernetes.io|pod|network|received_bytes_count"
      targetAverageValue: 20k

For v2beta2

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: xxxx-hoa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: "kubernetes.io|pod|network|received_bytes_count"
      target:
        type: AverageValue
        averageValue: 20k
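
To try either manifest, it can be applied and then inspected with kubectl describe, which shows the HPA's conditions and events. The sketch below assumes the manifest was saved to a file; the file name hpa.yaml and the function name are hypothetical.

```shell
# Sketch: apply an HPA manifest, then show its conditions and events.
# Assumption: the manifest file (e.g. hpa.yaml) holds one of the HPAs above.
apply_and_describe_hpa() {
  manifest="$1"; name="$2"; namespace="$3"
  kubectl apply -f "$manifest" || return 1
  kubectl describe hpa "$name" --namespace "$namespace"
}
```

For example, apply_and_describe_hpa hpa.yaml xxxx-hoa default would apply the v2beta2 manifest and describe the resulting HPA in the default namespace.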

Test output

Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric kubernetes.io|pod|network|received_bytes_count
  ScalingLimited  True    TooFewReplicas    the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age                 From                       Message
  ----    ------             ----                ----                       -------
  Normal  SuccessfulRescale  8m18s               horizontal-pod-autoscaler  New size: 4; reason: pods metric kubernetes.io|pod|network|received_bytes_count above target
  Normal  SuccessfulRescale  8m9s                horizontal-pod-autoscaler  New size: 6; reason: pods metric kubernetes.io|pod|network|received_bytes_count above target
  Normal  SuccessfulRescale  17s                 horizontal-pod-autoscaler  New size: 5; reason: All metrics below target
  Normal  SuccessfulRescale  9s (x2 over 8m55s)  horizontal-pod-autoscaler  New size: 2; reason: All metrics below target

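Possible issues with your current configuration

Your HPA specifies a namespace, but the target Deployment does not. The HPA and the Deployment must be in the same namespace. With such a mismatch you may run into the following: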

Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: deployments/scale.apps "nginx" not found
Events:
  Type     Reason          Age                  From                       Message
  ----     ------          ----                 ----                       -------
  Warning  FailedGetScale  94s (x264 over 76m)  horizontal-pod-autoscaler  deployments/scale.apps "nginx" not found
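
A quick way to rule out this mismatch is to look up the target Deployment in the HPA's namespace. This is a minimal sketch; the function name is hypothetical, and the Deployment name and namespace are whatever your HPA's scaleTargetRef and metadata.namespace say.

```shell
# Sketch: verify that the Deployment an HPA targets exists in the HPA's namespace.
deployment_in_namespace() {
  deployment="$1"; namespace="$2"
  if kubectl get deployment "$deployment" --namespace "$namespace" >/dev/null 2>&1; then
    echo "deployment/$deployment found in namespace $namespace"
  else
    echo "deployment/$deployment not found in namespace $namespace" >&2
    return 1
  fi
}
```

For example, deployment_in_namespace nginx default should succeed if the HPA in the default namespace can reach its target.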

In Kubernetes 1.16+, Deployments use apiVersion: apps/v1; you cannot create a Deployment with apiVersion: apps/v1beta1 on Kubernetes 1.16+.