GKE Kubernetes node pool upgrade is very slow

Date: 2018-03-21 17:15:16

Tags: kubernetes google-kubernetes-engine

I am testing a GKE cluster upgrade in a 6-node test cluster (two node pools) before trying it on our staging or production cluster. When the only things installed were a 12-replica nginx deployment, the nginx-ingress controller and cert-manager (as helm charts), the upgrade took about 10 minutes per node pool (3 nodes). I was very happy with that. Then I decided to try again with something that looks more like our real setup: I removed the nginx deployment and added 2 node.js deployments plus the following helm charts: mongodb-0.4.27, mcrouter-0.1.0 (as a statefulset), redis-ha-2.0.0, and my own www-redirect-0.0.1 chart (a simple nginx that does redirects). The problem seems to be with mcrouter. Once a node starts draining, its status changes to Ready,SchedulingDisabled (which seems normal), but the following pods remain on it:

  • mcrouter-memcached-0
  • fluentd-gcp-v2.0.9-4f87t
  • kube-proxy-gke-test-upgrade-cluster-default-pool-74f8edac-wblf

I do not know why those two kube-system pods are still there, but the mcrouter one is mine and it is not going away fast enough. If I wait long enough (1 hour+), it does eventually work, and I do not know why. The current node pool (3 nodes) started upgrading 2 h 46 min ago; 2 nodes are upgraded and the 3rd is still upgrading, but nothing is moving... I presume it will finish within the next 1-2 hours... I tried running the drain command with --ignore-daemonsets --force, but it told me the node was already drained. I also tried deleting the pods, but they just come back and the upgrade does not go any faster. Any ideas?
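For reference, the checks and the drain attempt looked roughly like this (the node name is inferred from the kube-proxy pod name above, so treat it as approximate):

# list everything still scheduled on the draining node
kubectl get pods --all-namespaces -o wide | grep gke-test-upgrade-cluster-default-pool-74f8edac-wblf
# try to finish the drain manually, skipping daemonset-managed pods
kubectl drain gke-test-upgrade-cluster-default-pool-74f8edac-wblf --ignore-daemonsets --force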

Update #1

The mcrouter helm chart was installed like this:

helm install stable/mcrouter --name mcrouter --set controller=statefulset

The statefulset it creates for the mcrouter part is:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  labels:
    app: mcrouter-mcrouter
    chart: mcrouter-0.1.0
    heritage: Tiller
    release: mcrouter
  name: mcrouter-mcrouter
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mcrouter-mcrouter
      chart: mcrouter-0.1.0
      heritage: Tiller
      release: mcrouter
  serviceName: mcrouter-mcrouter
  template:
    metadata:
      labels:
        app: mcrouter-mcrouter
        chart: mcrouter-0.1.0
        heritage: Tiller
        release: mcrouter
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: mcrouter-mcrouter
                release: mcrouter
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -p 5000
        - --config-file=/etc/mcrouter/config.json
        command:
        - mcrouter
        image: jphalip/mcrouter:0.36.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: mcrouter-port
          timeoutSeconds: 5
        name: mcrouter-mcrouter
        ports:
        - containerPort: 5000
          name: mcrouter-port
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: mcrouter-port
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 256m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 128Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/mcrouter
          name: config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: mcrouter-mcrouter
        name: config
  updateStrategy:
    type: OnDelete

And here is the memcached statefulset:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  labels:
    app: mcrouter-memcached
    chart: memcached-1.2.1
    heritage: Tiller
    release: mcrouter
  name: mcrouter-memcached
spec:
  podManagementPolicy: OrderedReady
  replicas: 5
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mcrouter-memcached
      chart: memcached-1.2.1
      heritage: Tiller
      release: mcrouter
  serviceName: mcrouter-memcached
  template:
    metadata:
      labels:
        app: mcrouter-memcached
        chart: memcached-1.2.1
        heritage: Tiller
        release: mcrouter
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: mcrouter-memcached
                release: mcrouter
            topologyKey: kubernetes.io/hostname
      containers:
      - command:
        - memcached
        - -m 64
        - -o
        - modern
        - -v
        image: memcached:1.4.36-alpine
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: memcache
          timeoutSeconds: 5
        name: mcrouter-memcached
        ports:
        - containerPort: 11211
          name: memcache
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: memcache
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    type: OnDelete
status:
  replicas: 0

2 Answers:

Answer 0 (score: 2)

This is a somewhat complex issue and I am definitely not sure it is what I think it is, but... let's try to understand what is happening.

You have an upgrade process and 6 nodes in the cluster. The system upgrades the nodes one by one, using drain to remove all the workloads from each node.

The drain process itself respects your settings and the number of replicas: the desired state of your workloads has a higher priority than the drain of the node itself.

During the drain, Kubernetes tries to reschedule all the workloads onto the nodes where scheduling is still enabled. Scheduling is disabled on the node the system wants to drain, and you can see that in its status: Ready,SchedulingDisabled.

So the Kubernetes scheduler tries to find a suitable place for your workloads on all the available nodes, and it will wait as long as it takes until everything described in your cluster configuration can be placed.

Now the most important part. You set replicas: 5 for mcrouter-memcached. Because of the podAntiAffinity rule, no node can run more than one replica, and the node a replica runs on must have enough free resources, which is calculated from the resources: requests block of the statefulset.

So I think your cluster simply does not have enough resources to run a new replica of mcrouter-memcached on the remaining 5 nodes. For example, on the last node where its replica is still not running, there is not enough free memory because of the other workloads.
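You can verify this with something like the following (a sketch of the usual checks; the pod name comes from the question):

# check why the pending memcached replica is not being scheduled;
# the Events section should show FailedScheduling with a reason such as
# insufficient memory or an unsatisfied pod anti-affinity rule
kubectl describe pod mcrouter-memcached-0

# compare the pods' resource requests against what each node has left
kubectl describe nodes | grep -A 4 "Allocated resources"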

I think that setting the replicas value of the memcached statefulset to 4 would solve the problem. Alternatively, you could use more powerful instances for that workload, or add one more node to the cluster; either of those should also help.
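A minimal sketch of how that could be done through helm, assuming the memcached subchart exposes the same memcached.replicaCount value that is used in the other answer below:

helm upgrade mcrouter stable/mcrouter --set controller=statefulset --set memcached.replicaCount=4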

I hope I explained my reasoning clearly enough; ask me if anything is still unclear. But first, please try to solve the issue with the proposed solution :)

Answer 1 (score: 0)

The problem was a combination of the minAvailable value of a PodDisruptionBudget (part of the memcached helm chart, which is a dependency of the mcrouter helm chart) and the replicas value of the memcached statefulset. Both were set to 5, so none of the memcached pods could be evicted during the drain. I tried changing minAvailable to 4, but PDBs are immutable at this time. What I did instead was remove the helm chart and replace it:

helm delete --purge myproxy
helm install ./charts/mcrouter-0.1.0-croy.1.tgz --name myproxy --set controller=statefulset --set memcached.replicaCount=5 --set memcached.pdbMinAvailable=4

Once that was done, I was able to get the cluster to upgrade normally.
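For context, the PodDisruptionBudget that blocked the drain would have looked roughly like this (a reconstruction based on the values above, not the chart's exact manifest):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: mcrouter-memcached
spec:
  minAvailable: 5          # equal to the number of replicas
  selector:
    matchLabels:
      app: mcrouter-memcached
      release: mcrouter

With minAvailable equal to the replica count, the eviction API never has room to remove a memcached pod, so the drain (and the GKE upgrade that relies on it) keeps retrying, which is presumably why each node only moved on after a long timeout.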

What I should have done (but only thought about it afterwards) was to change the replicas value to 6; that way I would not have needed to delete and replace the whole chart.

Thank you @AntonKostenko for trying to help me find this issue. This issue also helped me. Thanks to the folks in the Kubernetes Slack, especially to Paris, who tried to get my issue more visibility, and to the volunteers of the Kubernetes Office Hours (which happened to be yesterday, lucky me!) for also taking a look. Finally, thank you to psycotica0 from Kubernetes Canada for also giving me some pointers.