Kubernetes 1.7 on Google Cloud: FailedSync "Error syncing pod", SandboxChanged "Pod sandbox changed, it will be killed and re-created"

Time: 2017-10-25 01:21:17

Tags: docker kubernetes google-cloud-platform google-kubernetes-engine

My Kubernetes pods and containers are not starting. They are stuck in the ContainerCreating state.

I ran the command kubectl describe po PODNAME, which lists the pod's events, and I see the following errors:

Type        Reason            Message
Warning     FailedSync        Error syncing pod
Normal      SandboxChanged    Pod sandbox changed, it will be killed and re-created.

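The Count column shows that these errors repeat over and over, roughly once per second. The full output of the command is below, but how do I debug this? I am not even sure what these errors mean.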

Name:           ocr-extra-2939512459-3hkv1
Namespace:      ocr-da-cluster
Node:           gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2/10.240.0.11
Start Time:     Tue, 24 Oct 2017 21:05:01 -0400
Labels:         component=ocr
                pod-template-hash=2939512459
                role=extra
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"ocr-da-cluster","name":"ocr-extra-2939512459","uid":"d58bd050-b8f3-11e7-9f9e-4201...
Status:         Pending
IP:
Created By:     ReplicaSet/ocr-extra-2939512459
Controlled By:  ReplicaSet/ocr-extra-2939512459
Containers:
  ocr-node:
    Container ID:
    Image:              us.gcr.io/ocr-api/ocr-image
    Image ID:
    Ports:              80/TCP, 443/TCP, 5555/TCP, 15672/TCP, 25672/TCP, 4369/TCP, 11211/TCP
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Requests:
      cpu:      31
      memory:   10Gi
    Liveness:   http-get http://:http/ocr/live delay=270s timeout=30s period=60s #success=1 #failure=5
    Readiness:  http-get http://:http/_ah/warmup delay=180s timeout=60s period=120s #success=1 #failure=3
    Environment:
      NAMESPACE:        ocr-da-cluster (v1:metadata.namespace)
    Mounts:
      /var/log/apache2 from apachelog (rw)
      /var/log/celery from cellog (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
  log-apache2-error:
    Container ID:
    Image:              busybox
    Image ID:
    Port:               <none>
    Args:
      /bin/sh
      -c
      echo Apache2 Error && sleep 90 && tail -n+1 -F /var/log/apache2/error.log
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Requests:
      cpu:              20m
    Environment:        <none>
    Mounts:
      /var/log/apache2 from apachelog (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
  log-worker-1:
    Container ID:
    Image:              busybox
    Image ID:
    Port:               <none>
    Args:
      /bin/sh
      -c
      echo Celery Worker && sleep 90 && tail -n+1 -F /var/log/celery/worker*.log
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Requests:
      cpu:              20m
    Environment:        <none>
    Mounts:
      /var/log/celery from cellog (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  apachelog:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  cellog:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-dhjr5:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-dhjr5
    Optional:   false
QoS Class:      Burstable
Node-Selectors: beta.kubernetes.io/instance-type=n1-highcpu-32
Tolerations:    node.alpha.kubernetes.io/notReady:NoExecute for 300s
                node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  FirstSeen     LastSeen        Count   From                                                        SubObjectPath       Type            Reason                  Message
  ---------     --------        -----   ----                                                        -------------       --------        ------                  -------
  10m           10m             2       default-scheduler                                                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient cpu (10), Insufficient memory (2), MatchNodeSelector (2).
  10m           10m             1       default-scheduler                                                       Normal          Scheduled               Successfully assigned ocr-extra-2939512459-3hkv1 to gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2
  10m           10m             1       kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "apachelog"
  10m           10m             1       kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "cellog"
  10m           10m             1       kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-dhjr5"
  10m           1s              382     kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Warning         FailedSync              Error syncing pod
  10m           0s              382     kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SandboxChanged          Pod sandbox changed, it will be killed and re-created.

2 Answers:

Answer 0 (score: 2)

Check your resource limits. I ran into the same problem, and in my case the cause was a wrong unit suffix on the memory limits and memory requests: I had mixed up m with the proper memory unit (presumably Mi).
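To illustrate why the suffix matters for memory values, here is a minimal sketch of how Kubernetes quantity suffixes scale a number. The function name quantity_to_bytes and the suffix table are hypothetical helpers written for this answer, and the table covers only a subset of the suffixes Kubernetes accepts.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes quantity suffixes.
# "m" is the milli suffix (1/1000); "Mi" is the binary mebi suffix (2**20).
SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
}


def quantity_to_bytes(quantity: str) -> Decimal:
    """Convert a memory quantity string such as "128Mi" to bytes."""
    # Check two-letter suffixes first so "Mi" is never matched as "M".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * SUFFIXES[suffix]
    # No suffix: the value is already in bytes.
    return Decimal(quantity)


print(quantity_to_bytes("128Mi"))  # 134217728 bytes
print(quantity_to_bytes("128m"))   # 0.128 bytes
```

With the m suffix, a memory value of 128m asks for a fraction of a single byte, which is very unlikely to be what was intended.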

Answer 1 (score: 0)

Are you sure you need 31 CPUs as the initial request (ocr-node)? That would require a very large node.
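As a rough check, the CPU requests from the pod description in the question can be added up and compared with the node size. This is only a sketch: cpu_to_millicores is a hypothetical helper, and it compares against the raw 32 vCPUs of an n1-highcpu-32 machine, whereas the node's actual allocatable CPU (shown by kubectl describe node) is lower because part of the capacity is reserved for system components.

```python
def cpu_to_millicores(cpu: str) -> int:
    """Convert a Kubernetes CPU quantity such as "31" or "20m" to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(float(cpu) * 1000)


# CPU requests copied from the pod description in the question.
requests = {"ocr-node": "31", "log-apache2-error": "20m", "log-worker-1": "20m"}
total_millicores = sum(cpu_to_millicores(v) for v in requests.values())

# Raw capacity of an n1-highcpu-32 machine (32 vCPUs), not its allocatable CPU.
node_capacity_millicores = 32 * 1000
headroom_millicores = node_capacity_millicores - total_millicores

print(total_millicores, headroom_millicores)  # prints: 31040 960
```

The pod requests 31040m in total, which leaves less than one core of raw capacity for everything else on the node.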

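I have seen a similar problem with some of my pods. Deleting them and letting them be re-created sometimes helps, but not consistently. I am sure that enough resources were available.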

See: Kubernetes pods failing on "Pod sandbox changed, it will be killed and re-created"