k8s hpa无法获取cpu信息

时间:2019-12-10 08:05:56

标签: kubernetes cpu autoscaling

我设置了一个hpa use命令

sudo kubectl autoscale deployment e7-build-64 --cpu-percent=50 --min=1 --max=2 -n k8s-demo

sudo kubectl get hpa -n k8s-demo

NAME              REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
e7-build-64       Deployment/e7-build-64   <unknown>/50%   1         2         1          15m

sudo kubectl描述了hpa e7-build-64 -n k8s-demo

Name:                                                  e7-build-64
Namespace:                                             k8s-demo
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Tue, 10 Dec 2019 15:34:24 +0800
Reference:                                             Deployment/e7-build-64
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 50%
Min replicas:                                          1
Max replicas:                                          2
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                        Age                 From                       Message
  ----     ------                        ----                ----                       -------
  Warning  FailedComputeMetricsReplicas  13m (x12 over 16m)  horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedGetResourceMetric       74s (x61 over 16m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API

在deployment.yaml中,我已经添加了资源请求并受到限制

resources:
  limits:
    memory: "16Gi"
    cpu: "4000m"
  requests: 
    memory: "4Gi"
    cpu: "2000m"

kubectl版本

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

然后我尝试使用yaml设置hpa

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-e7-build-64
  namespace: k8s-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: e7-build-64
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10

它仍然有一些错误 sudo kubectl描述了hpa hpa-e7-build-64 -n k8s-demo

Name:                                                  hpa-e7-build-64
Namespace:                                             k8s-demo
Labels:                                                <none>
Annotations:                                           kubectl.kubernetes.io/last-applied-configuration:
                                                         {"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-e7-build-64","namespace":"k8...
CreationTimestamp:                                     Tue, 10 Dec 2019 14:24:07 +0800
Reference:                                             Deployment/e7-build-64
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 10%
Min replicas:                                          1
Max replicas:                                          2
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                        Age                    From                       Message
  ----     ------                        ----                   ----                       -------
  Warning  FailedGetResourceMetric       59m (x141 over 94m)    horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
  Warning  FailedGetResourceMetric       54m (x2 over 54m)      horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
  Warning  FailedComputeMetricsReplicas  39m (x58 over 53m)     horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedGetResourceMetric       4m29s (x197 over 53m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API

我已经执行了以下命令:

git clone https://github.com/kubernetes-incubator/metrics-server.git (fetch)
cd metrics-server/deploy
sudo kubectl create -f 1.8+/

有人知道如何解决吗?

更新:

sudo kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
{"kind":"PodMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"},"items":[]}

和广告连播信息:

sudo kubectl describe pod metrics-server-795b774c76-fs8hw -n kube-system
Name:         metrics-server-795b774c76-fs8hw
Namespace:    kube-system
Priority:     0
Node:         nandoc-95/192.168.33.225
Start Time:   Tue, 10 Dec 2019 15:04:14 +0800
Labels:       k8s-app=metrics-server
              pod-template-hash=795b774c76
Annotations:  cni.projectcalico.org/podIP: 10.0.229.135/32
Status:       Running
IP:           10.0.229.135
IPs:
  IP:           10.0.229.135
Controlled By:  ReplicaSet/metrics-server-795b774c76
Containers:
  metrics-server:
    Container ID:  docker://2c6dd8c50938bc9ab536c78b73773aa7a9eedd60a6974805beec58e8ee9fde3c
    Image:         k8s.gcr.io/metrics-server-amd64:v0.3.6
    Image ID:      docker-pullable://k8s.gcr.io/metrics-server-amd64@sha256:c9c4e95068b51d6b33a9dccc61875df07dc650abbf4ac1a19d58b4628f89288b
    Port:          4443/TCP
    Host Port:     0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
    State:          Running
      Started:      Tue, 10 Dec 2019 15:05:13 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-xjgpx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  metrics-server-token-xjgpx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-server-token-xjgpx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

sudo kubectl获取容器--all-namespaces -o宽

NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE    IP               NODE              NOMINATED NODE   READINESS GATES
k8s-demo               k8s-pod-e7-build-32-7bb5bc7c6-s2zsr          1/1     Running   0          32m    10.0.100.198     nandoc-94         <none>           <none>
k8s-demo               k8s-pod-e7-build-64-d5c659d6b-5hv6m          1/1     Running   0          31m    10.0.229.137     nandoc-95         <none>           <none>
kube-system            calico-kube-controllers-55754f75c-82np8      1/1     Running   0          5d     10.0.126.1       nandoc-93         <none>           <none>
kube-system            calico-node-2dxmp                            1/1     Running   0          2d5h   192.168.33.225   nandoc-95         <none>           <none>
kube-system            calico-node-7ms8t                            1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
kube-system            calico-node-hdw25                            1/1     Running   1          21d    192.168.33.224   nandoc-94         <none>           <none>
kube-system            calico-node-j4jv4                            0/1     Running   0          27d    192.168.37.173   cyuan-k8s-node1   <none>           <none>
kube-system            calicoctl                                    1/1     Running   0          6d     192.168.33.224   nandoc-94         <none>           <none>
kube-system            coredns-5644d7b6d9-n9z5m                     1/1     Running   0          5d     10.0.126.2       nandoc-93         <none>           <none>
kube-system            coredns-5644d7b6d9-txcm4                     1/1     Running   0          5d     10.0.100.194     nandoc-94         <none>           <none>
kube-system            etcd-nandoc-93                               1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
kube-system            kube-apiserver-nandoc-93                     1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
kube-system            kube-controller-manager-nandoc-93            1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
kube-system            kube-proxy-5jlfc                             1/1     Running   0          27d    192.168.37.173   cyuan-k8s-node1   <none>           <none>
kube-system            kube-proxy-7t7b7                             1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
kube-system            kube-proxy-j5b4c                             1/1     Running   0          2d5h   192.168.33.225   nandoc-95         <none>           <none>
kube-system            kube-proxy-jj256                             1/1     Running   1          21d    192.168.33.224   nandoc-94         <none>           <none>
kube-system            kube-scheduler-nandoc-93                     1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
kube-system            metrics-server-795b774c76-fs8hw              1/1     Running   0          24h    10.0.229.135     nandoc-95         <none>           <none>
kubernetes-dashboard   dashboard-metrics-scraper-76585494d8-wqgks   1/1     Running   0          5d     10.0.126.3       nandoc-93         <none>           <none>
kubernetes-dashboard   kubernetes-dashboard-b65488c4-qh95m          1/1     Running   0          5d     10.0.126.4       nandoc-93         <none>           <none>

sudo kubectl get hpa --all-namespaces -o宽

NAMESPACE   NAME                  REFERENCE                        TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
k8s-demo    hpa-e7-build-32       Deployment/k8s-pod-e7-build-32   <unknown>/10%   1         2         1          85s
k8s-demo    hpa-e7-build-64       Deployment/k8s-pod-e7-build-64   <unknown>/10%   1         2         1          79s
k8s-demo    k8s-pod-e7-build-64   Deployment/k8s-pod-e7-build-64   <unknown>/50%   1         2         1          16s

我更新了pod名称并重新创建了hpa,今天添加了前缀k8s-pod-。因此输出与以前不同。

2 个答案:

答案 0 :(得分:4)

对于Kubernetes 1.18和Metrics v0.3.7,我们应该编辑指标服务器部署以反映以下参数:

args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
  - --cert-dir=/tmp
  - --secure-port=4443

答案 1 :(得分:1)

感谢您的加入和EAT_Py。我已经解决了这个问题。 调试过程:

sudo kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
sudo kubectl -n kube-system logs metrics-server-795b774c76-t2rj7
sudo kubectl top node nandoc-94   -->can't get info
sudo kubectl top pod k8s-pod-e7-build-32-7bb5bc7c6-s2zsr   -->can't get info

metrics-server的日志中有一些错误信息:

kubelet_summary:nandoc-93: unable to fetch metrics from Kubelet nandoc-93 (nandoc-93): Get https://nandoc-93:10250/stats/summary?only_cpu_and_memory=true: x509: certificate signed by unknown authority]

然后根据https://github.com/kubernetes-sigs/metrics-server/issues/146 我编辑metrics-server / deploy / 1.8 + / metrics-server-deployment.yaml并添加命令

  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.6
    command:
    - /metrics-server
    - --kubelet-insecure-tls

kubectl apply -fmetrics-server-deployment.yaml

之后,kubectl顶部吊舱工作正常。hpa现在工作。再次谢谢你。

sudo kubectl get hpa --all-namespaces
NAMESPACE   NAME              REFERENCE                        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
k8s-demo    hpa-e7-build-32   Deployment/k8s-pod-e7-build-32   0%/10%    1         2         1          19h
k8s-demo    hpa-e7-build-64   Deployment/k8s-pod-e7-build-64   0%/10%    1         2         1          19h