Kube-flannel in CrashLoopBackOff status

Date: 2018-08-30 13:23:42

Tags: kubernetes, flannel

We have just started building a cluster on Kubernetes.

Now we are trying to deploy Tiller, but we get this error:

  

NetworkPlugin cni failed to set up pod "tiller-deploy-64c9d747bd-br9j7_kube-system" network: open /run/flannel/subnet.env: no such file or directory
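This file is normally written by flanneld once it starts successfully and acquires a subnet lease, which is why it is missing while the flannel pods are crash-looping. For reference, a typical /run/flannel/subnet.env (the values below are only illustrative and assume the default 10.244.0.0/16 pod network) looks like:

    # Written by flanneld after it acquires a lease; values depend on your pod network
    FLANNEL_NETWORK=10.244.0.0/16
    FLANNEL_SUBNET=10.244.1.1/24
    FLANNEL_MTU=1450
    FLANNEL_IPMASQ=true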

After I call


  kubectl get pods --all-namespaces -o wide

I get this response:

We have some flannel pods in CrashLoopBackOff state. For example:

NAMESPACE     NAME                                   READY   STATUS              RESTARTS   AGE   IP              NODE          NOMINATED NODE
kube-system   coredns-78fcdf6894-ksdvt               1/1     Running             2          7d    192.168.0.4     kube-master   <none>
kube-system   coredns-78fcdf6894-p4l9q               1/1     Running             2          7d    192.168.0.5     kube-master   <none>
kube-system   etcd-kube-master                       1/1     Running             2          7d    10.168.209.20   kube-master   <none>
kube-system   kube-apiserver-kube-master             1/1     Running             2          7d    10.168.209.20   kube-master   <none>
kube-system   kube-controller-manager-kube-master    1/1     Running             2          7d    10.168.209.20   kube-master   <none>
kube-system   kube-flannel-ds-amd64-42rl7            0/1     CrashLoopBackOff    2135       7d    10.168.209.17   node5         <none>
kube-system   kube-flannel-ds-amd64-5fx2p            0/1     CrashLoopBackOff    2164       7d    10.168.209.14   node2         <none>
kube-system   kube-flannel-ds-amd64-6bw5g            0/1     CrashLoopBackOff    2166       7d    10.168.209.15   node3         <none>
kube-system   kube-flannel-ds-amd64-hm826            1/1     Running             1          7d    10.168.209.20   kube-master   <none>
kube-system   kube-flannel-ds-amd64-thjps            0/1     CrashLoopBackOff    2160       7d    10.168.209.16   node4         <none>
kube-system   kube-flannel-ds-amd64-w99ch            0/1     CrashLoopBackOff    2166       7d    10.168.209.13   node1         <none>
kube-system   kube-proxy-d6v2n                       1/1     Running             0          7d    10.168.209.13   node1         <none>
kube-system   kube-proxy-lcckg                       1/1     Running             0          7d    10.168.209.16   node4         <none>
kube-system   kube-proxy-pgblx                       1/1     Running             1          7d    10.168.209.20   kube-master   <none>
kube-system   kube-proxy-rnqq5                       1/1     Running             0          7d    10.168.209.14   node2         <none>
kube-system   kube-proxy-wc959                       1/1     Running             0          7d    10.168.209.15   node3         <none>
kube-system   kube-proxy-wfqqs                       1/1     Running             0          7d    10.168.209.17   node5         <none>
kube-system   kube-scheduler-kube-master             1/1     Running             2          7d    10.168.209.20   kube-master   <none>
kube-system   kubernetes-dashboard-6948bdb78-97qcq   0/1     ContainerCreating   0          7d    <none>          node5         <none>
kube-system   tiller-deploy-64c9d747bd-br9j7         0/1     ContainerCreating   0          45m   <none>          node4         <none>

When I call

  kubectl describe pod -n kube-system kube-flannel-ds-amd64-42rl7

the pod status is reported as Running. Here is the full output:

Name:               kube-flannel-ds-amd64-42rl7
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               node5/10.168.209.17
Start Time:         Wed, 22 Aug 2018 16:47:10 +0300
Labels:             app=flannel
                    controller-revision-hash=911701653
                    pod-template-generation=1
                    tier=node
Annotations:        <none>
Status:             Running
IP:                 10.168.209.17
Controlled By:      DaemonSet/kube-flannel-ds-amd64
Init Containers:
  install-cni:
    Container ID:  docker://eb7ee47459a54d401969b1770ff45b39dc5768b0627eec79e189249790270169
    Image:         quay.io/coreos/flannel:v0.10.0-amd64
    Image ID:      docker-pullable://quay.io/coreos/flannel@sha256:88f2b4d96fae34bfff3d46293f7f18d1f9f3ca026b4a4d288f28347fcb6580ac
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
    Args:
      -f
      /etc/kube-flannel/cni-conf.json
      /etc/cni/net.d/10-flannel.conflist
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 22 Aug 2018 16:47:24 +0300
      Finished:     Wed, 22 Aug 2018 16:47:24 +0300
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/cni/net.d from cni (rw)
      /etc/kube-flannel/ from flannel-cfg (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from flannel-token-9wmch (ro)
Containers:
  kube-flannel:
    Container ID:  docker://521b457c648baf10f01e26dd867b8628c0f0a0cc0ea416731de658e67628d54e
    Image:         quay.io/coreos/flannel:v0.10.0-amd64
    Image ID:      docker-pullable://quay.io/coreos/flannel@sha256:88f2b4d96fae34bfff3d46293f7f18d1f9f3ca026b4a4d288f28347fcb6580ac
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/bin/flanneld
    Args:
      --ip-masq
      --kube-subnet-mgr
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 30 Aug 2018 10:15:04 +0300
      Finished:     Thu, 30 Aug 2018 10:15:08 +0300
    Ready:          False
    Restart Count:  2136
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:       kube-flannel-ds-amd64-42rl7 (v1:metadata.name)
      POD_NAMESPACE:  kube-system (v1:metadata.namespace)
    Mounts:
      /etc/kube-flannel/ from flannel-cfg (rw)
      /run from run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from flannel-token-9wmch (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run
    HostPathType:
  cni:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  flannel-cfg:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-flannel-cfg
    Optional:  false
  flannel-token-9wmch:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  flannel-token-9wmch
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  beta.kubernetes.io/arch=amd64
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason   Age                  From            Message
  ----     ------   ----                 ----            -------
  Normal   Pulled   51m (x2128 over 7d)  kubelet, node5  Container image "quay.io/coreos/flannel:v0.10.0-amd64" already present on machine
  Warning  BackOff  1m (x48936 over 7d)  kubelet, node5  Back-off restarting failed container

The OS is CentOS Linux release 7.5.1804.

Here is kube-controller-manager.yaml:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --address=127.0.0.1
    - --allocate-node-cidrs=true
    - --cluster-cidr=192.168.0.0/24
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --use-service-account-credentials=true
    image: k8s.gcr.io/kube-controller-manager-amd64:v1.11.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10252
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-controller-manager
    resources:
      requests:
        cpu: 200m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/kubernetes/controller-manager.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      name: flexvolume-dir
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
status: {}

Where is the error?

4 Answers:

Answer 0 (score: 2)

I had a similar issue. I performed the following steps to get it working (a consolidated command sketch follows the list):

  • Remove the node from the master by running kubeadm reset on the worker node.

  • Flush the iptables rules with iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X.

  • Clear the config file with rm -rf $HOME/.kube/config.

  • Reboot the worker node.

  • Disable swap on the worker node with swapoff -a.

  • Join the master again.
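A consolidated sketch of those steps, assuming a kubeadm-based setup (the master address, token, and CA hash below are placeholders you must replace with the values from your own cluster):

    # On the failing worker node: reset kubeadm state and clean up
    kubeadm reset
    iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
    rm -rf $HOME/.kube/config
    swapoff -a
    reboot

    # After the reboot, rejoin the cluster from the worker node.
    # Take the exact join command from `kubeadm token create --print-join-command` on the master.
    kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>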

Answer 1 (score: 1)

For flannel to work correctly, you must pass --pod-network-cidr=10.244.0.0/16 to kubeadm init.
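For example, a minimal init invocation for a flannel-based cluster might look like this (10.244.0.0/16 is flannel's default network in its shipped net-conf.json; use your own range if you changed it):

    kubeadm init --pod-network-cidr=10.244.0.0/16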

Answer 2 (score: 0)

Try this:

"Failed to acquire lease" simply means the pod did not get a podCIDR. The same thing happened to me: even though the manifest on the master node showed the podCIDR as correct, it still did not work and flannel went into a crash loop. This is how I fixed it.

First, find the flannel CIDR on the master node:

sudo cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep -i cluster-cidr

Output:

- --cluster-cidr=172.168.10.0/24

Then run the following command from the master node:

kubectl patch node slave-node-1 -p '{"spec":{"podCIDR":"172.168.10.0/24"}}'

where slave-node-1 is the node that failed to acquire a lease, and podCIDR is the CIDR you found with the previous command.
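One quick way to check whether the patch took effect and whether the node's CIDR lines up with flannel's own configuration (kube-flannel-cfg is the ConfigMap name visible in the describe output above; adjust if yours differs):

    # Show the podCIDR now assigned to the node
    kubectl get node slave-node-1 -o jsonpath='{.spec.podCIDR}'

    # Show the network that flannel itself is configured with
    kubectl -n kube-system get configmap kube-flannel-cfg -o jsonpath='{.data.net-conf\.json}'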

Hope this helps.

Answer 3 (score: 0)

Also make sure SELinux is set to permissive or disabled:

 # getenforce
    Permissive
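If getenforce reports Enforcing instead, one possible way to switch it on CentOS 7 is shown below (setenforce changes the running system, while the sed edit makes the change persist across reboots; adjust to "disabled" if that is what you prefer):

    # Switch SELinux to permissive for the running system
    setenforce 0

    # Persist the setting across reboots
    sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config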