Question

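I am using Google Kubernetes Engine with a single cluster. The cluster autoscales its number of nodes. I created three deployments and set an autoscaling policy for each through the web console (Workloads -> Deployments -> Actions -> Autoscale), so I did not write the YAML configuration by hand. According to the official guide, I have not made any mistakes.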

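If you do not specify requests, you can autoscale based only on the absolute value of the resource's utilization, such as milliCPUs for CPU utilization.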

Here is the full YAML for the deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: student
  name: student
  namespace: ulibretto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: student
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: student
    spec:
      containers:
        - env:
            - name: CLUSTER_HOST
              valueFrom:
                configMapKeyRef:
                  key: CLUSTER_HOST
                  name: shared-env-vars
            - name: BIND_HOST
              valueFrom:
                configMapKeyRef:
                  key: BIND_HOST
                  name: shared-env-vars
            - name: TOKEN_TIMEOUT
              valueFrom:
                configMapKeyRef:
                  key: TOKEN_TIMEOUT
                  name: shared-env-vars
          image: gcr.io/ulibretto/github.com/ulibretto/studentservice
          imagePullPolicy: IfNotPresent
          name: studentservice-1
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: student
  name: student-hpa-n3bp
  namespace: ulibretto
spec:
  maxReplicas: 100
  metrics:
    - resource:
        name: cpu
        targetAverageUtilization: 80
      type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: student
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress":true}'
  labels:
    app: student
  name: student-ingress
  namespace: ulibretto
spec:
  clusterIP: 10.44.5.59
  ports:
    - port: 5000
      protocol: TCP
      targetPort: 5000
  selector:
    app: student
  sessionAffinity: None
  type: ClusterIP

The problem is that the HPA does not see the metric (average CPU utilization), which is really strange (see the image).

[Image: HPA cannot read metric value]

What am I missing?

Answer 1

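Edited

You are right. You do not have to specify namespace: ulibretto in scaleTargetRef:, as I suggested earlier.

Now that you have provided the full YAML, I was able to find the actual root cause.

If you check the GKE docs, you will find this comment in the sample code: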

    resources:
      # You must specify requests for CPU to autoscale
      # based on CPU utilization
      requests:
        cpu: "250m"

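Your deployment does not specify resource requests. Without a CPU request, the HPA has nothing to compute a utilization percentage against. I tried the following (I removed some parts because I cannot deploy your container, and I changed the apiVersion in the HPA):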

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: student
  name: student
  namespace: ulibretto
spec:
  replicas: 3
  selector:
    matchLabels:
      app: student
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: student
    spec:
      containers:
      - image: nginx
        imagePullPolicy: IfNotPresent
        name: studentservice-1
        resources:
          requests:
            cpu: "250m"
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: student
  name: student-hpa
  namespace: ulibretto
spec:
  maxReplicas: 100
  minReplicas: 1
  targetCPUUtilizationPercentage: 80
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: student

$ kubectl get all -n ulibretto
NAME                           READY   STATUS    RESTARTS   AGE
pod/student-6f797d5888-84xfq   1/1     Running   0          7s
pod/student-6f797d5888-b7ctq   1/1     Running   0          7s
pod/student-6f797d5888-fbtmd   1/1     Running   0          7s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/student   3/3     3            3           7s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/student-6f797d5888   3         3         3       7s

NAME                                              REFERENCE            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/student-hpa   Deployment/student   <unknown>/80%   1         100       0          7s

After about 1-5 minutes, the HPA starts receiving metrics:

$ kubectl get all -n ulibretto
NAME                           READY   STATUS    RESTARTS   AGE
pod/student-6f797d5888-84xfq   1/1     Running   0          95s
pod/student-6f797d5888-b7ctq   1/1     Running   0          95s
pod/student-6f797d5888-fbtmd   1/1     Running   0          95s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/student   3/3     3            3           95s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/student-6f797d5888   3         3         3       95s

NAME                                              REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/student-hpa   Deployment/student   0%/80%    1         100       3          95s
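
The HPA in the question uses autoscaling/v2beta1, while the test above switches to autoscaling/v1. On clusters that serve the stable autoscaling/v2 API, the same 80% CPU target could be written roughly as follows. This is only a sketch that assumes autoscaling/v2 is available on the cluster, and it still depends on the container having a CPU request:

```yaml
# Sketch: the student HPA expressed in the autoscaling/v2 schema.
# Assumes the cluster serves autoscaling/v2.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: student
  name: student-hpa
  namespace: ulibretto
spec:
  minReplicas: 1
  maxReplicas: 100
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: student
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          # Utilization is a percentage of the container's CPU request,
          # so it cannot be computed when no request is set.
          type: Utilization
          averageUtilization: 80
```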

The same applies if you create the HPA with the CLI:

$ kubectl autoscale deployment student -n ulibretto --cpu-percent=50 --min=1 --max=100
horizontalpodautoscaler.autoscaling/student autoscaled
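
For reference, the kubectl autoscale command above should be roughly equivalent to applying a manifest like the one below. This is a sketch rather than the exact object kubectl creates: the HPA takes the Deployment's name (student, as the command output shows), and the autoscaling/v1 schema is assumed here.

```yaml
# Sketch: declarative counterpart of
#   kubectl autoscale deployment student -n ulibretto --cpu-percent=50 --min=1 --max=100
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: student          # kubectl autoscale reuses the Deployment name
  namespace: ulibretto
spec:
  minReplicas: 1         # --min=1
  maxReplicas: 100       # --max=100
  targetCPUUtilizationPercentage: 50   # --cpu-percent=50
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: student
```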

$ kubectl get hpa -n ulibretto
NAME      REFERENCE            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
student   Deployment/student   <unknown>/50%   1         100       0          3s

After a while, you will see 0% instead of <unknown>:

$ kubectl get all -n ulibretto
NAME                           READY   STATUS    RESTARTS   AGE
pod/student-6f797d5888-84xfq   1/1     Running   0          4m4s
pod/student-6f797d5888-b7ctq   1/1     Running   0          4m4s
pod/student-6f797d5888-fbtmd   1/1     Running   0          4m4s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/student   3/3     3            3           4m5s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/student-6f797d5888   3         3         3       4m5s

NAME                                          REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/student   Deployment/student   0%/50%    1         100       3          58s
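
Applied to the Deployment from the question, the fix is a resources block on the studentservice-1 container. The fragment below is only a sketch: the 250m value is borrowed from the GKE docs sample and is an assumption, not a measured figure for this service, and the env entries are omitted for brevity.

```yaml
# Fragment of the question's Deployment (merge into the existing manifest).
spec:
  template:
    spec:
      containers:
        - name: studentservice-1
          image: gcr.io/ulibretto/github.com/ulibretto/studentservice
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              # Assumed value; tune it to the service's real CPU usage.
              cpu: "250m"
```

With a request in place, the HPA can express usage as a percentage of that request, so an 80% target on a 250m request corresponds to roughly 200m of average CPU per pod.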

HPA cannot read metric value (CPU utilization) on GKE
