Kubernetes DaemonSet: cause of the error

Time: 2018-06-21 12:54:33

Tags: kubernetes elastic-stack

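I have a k8s DaemonSet whose only job is to run sysctl -w vm.max_map_count=262144 on the host node where its pod is deployed. The DaemonSet works as expected the first time the resource is applied. However, if the k8s node it runs on is rebooted later, the DaemonSet pod does not update the host's vm.max_map_count to 262144. The DaemonSet pods go into the Running state, but when described they show: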

State:          Running
  Started:      Thu, 21 Jun 2018 12:01:51 +0100
Last State:     Terminated
  Reason:       Error
  Exit Code:    143

But I cannot work out the cause of the error, and I do not know where to look to resolve the problem.

daemonset yaml:

kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: ds-elk
  labels:
    app: elk
spec:
  template:
    metadata:
      labels:
        app: elk
    spec:
      hostPID: true
      containers:
        - name: startup-script
          image: gcr.io/google-containers/startup-script:v1
          imagePullPolicy: Always
          securityContext:
            privileged: true
          env:
          - name: STARTUP_SCRIPT
            value: |
              #! /bin/bash
              sysctl -w vm.max_map_count=262144
              echo done

The hosts run Red Hat EL 7.4. The Kubernetes server version is 1.8.6.

Output of kubectl describe pod ds-elk-5z5hs:

Name:           ds-elk-5z5hs
Namespace:      default
Node:           xxx-00-xxxx-01v.devxxx.xxxxxx.xx.xx/xx.xxx.xx.xx
Start Time:     Tue, 15 May 2018 14:03:14 +0100
Labels:         app=elk
                controller-revision-hash=2068481183
                pod-template-generation=1
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"DaemonSet","namespace":"default","name":"ds-elk","uid":"54372241-5840-11e8-aaaa-005056b97218","apiVersion"...
Status:         Running
IP:             xx.xxx.x.xxx
Controlled By:  DaemonSet/ds-elk
Containers:
  startup-script:
    Container ID:   docker://eff849b842ed7b28dcf07578301a12068c998cb42b59a88b2bf2e8243b72f419
    Image:          gcr.io/google-containers/startup-script:v1
    Image ID:       docker-pullable://gcr.io/google-containers/startup-script@sha256:be96df6845a2af0eb61b17817ed085ce41048e4044c541da7580570b61beff3e
    Port:           <none>
    State:          Running
      Started:      Thu, 21 Jun 2018 11:40:50 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    143
      Started:      Thu, 21 Jun 2018 07:24:56 +0100
      Finished:     Thu, 21 Jun 2018 11:39:22 +0100
    Ready:          True
    Restart Count:  2
    Environment:
      STARTUP_SCRIPT:  #! /bin/bash
sysctl -w vm.max_map_count=262144
echo done

    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-ld98j (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  default-token-ld98j:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-ld98j
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute
                 node.alpha.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
Events:          <none>

2 answers:

Answer 0 (score: 0)

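The script uses the file /tmp/startup-script.kubernetes.io as a marker that the script has already run once. That file is placed in your node's /tmp directory, not in the container. So the next time the DaemonSet schedules a pod onto that node, the script finds the marker and just sleeps.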
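The run-once behaviour described above can be sketched roughly as follows. This is an illustration only, not the actual manage-startup-script.sh: MARKER_DIR and run_once are hypothetical names, a temporary directory stands in for the node's /tmp, and an echo stands in for the sysctl call. It may also explain the reboot symptom: a value set with sysctl -w is lost on reboot, whereas a marker file in /tmp would survive if the node does not clear /tmp at boot (an assumption about the host configuration).

```shell
#!/bin/bash
# Sketch of a run-once checkpoint; NOT the real manage-startup-script.sh.
# MARKER_DIR is a hypothetical stand-in for the node's /tmp directory.
MARKER_DIR="$(mktemp -d)"
MARKER="${MARKER_DIR}/startup-script.kubernetes.io"

run_once() {
  if [ -e "${MARKER}" ]; then
    # The marker is already on the node, so the startup script is skipped.
    echo "skipped"
    return 0
  fi
  # Stand-in for the real work: sysctl -w vm.max_map_count=262144
  echo "ran"
  # Record on the "node" that the script has completed once.
  touch "${MARKER}"
}

first="$(run_once)"    # first pod on the node: the script runs
second="$(run_once)"   # later pod on the same node: marker found, script skipped
echo "first=${first} second=${second}"
```

Under this reading, deleting the marker file on the node would let the script run again the next time the pod starts.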

Here is the script to look at: https://github.com/kubernetes/contrib/blob/master/startup-script/manage-startup-script.sh

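Note that the image you referenced was not built from exactly this version of the code. In particular, it does not append the md5sum suffix to the marker file name, which is what allows the script to run again after the script code has been changed.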
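As for the Exit Code: 143 in the describe output, exit codes above 128 conventionally mean the process was ended by signal number (code - 128), so 143 corresponds to signal 15, SIGTERM. That would be consistent with the previous container having been stopped (for example during a node shutdown) rather than the script itself failing, though this is an inference from the exit code alone. A small sketch of the decoding, where decode_exit_code is a hypothetical helper name:

```shell
#!/bin/bash
# Decode a container exit code into the signal that ended the process.
# decode_exit_code is a hypothetical helper, not part of kubectl or docker.
decode_exit_code() {
  local code="$1"
  if [ "${code}" -gt 128 ]; then
    # 128 + N conventionally means "terminated by signal N".
    kill -l "$((code - 128))"
  else
    echo "exited with status ${code}"
  fi
}

decode_exit_code 143
```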

Answer 1 (score: 0)

In the end I got rid of the DaemonSet entirely and instead set vm.max_map_count in the initContainers of the pod spec:

  initContainers:
  - name: "sysctl"
    image: "busybox"
    imagePullPolicy: "Always"
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
    securityContext:
      privileged: true
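Whichever approach is used, the result can be checked on the node by reading the live value back. A minimal sketch, assuming a Linux host that exposes the setting under /proc/sys; check_max_map_count is a hypothetical helper name, and the file path is a parameter so the function can be tried against any file:

```shell
#!/bin/bash
# check_max_map_count FILE REQUIRED
# Reports whether the value stored in FILE is at least REQUIRED.
# Hypothetical helper for illustration only.
check_max_map_count() {
  local file="$1" required="$2" current
  if [ ! -r "${file}" ]; then
    echo "cannot read ${file}"
    return 2
  fi
  current="$(cat "${file}")"
  if [ "${current}" -ge "${required}" ]; then
    echo "ok: ${current}"
  else
    echo "too low: ${current} < ${required}"
    return 1
  fi
}

# On a Linux node the current kernel setting is exposed here.
check_max_map_count /proc/sys/vm/max_map_count 262144 || true
```

Running this on the node after a reboot should show whether the setting was reapplied.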