I created a cluster with kubeadm. Every pod except CoreDNS is up and running; CoreDNS stays in CrashLoopBackOff and never starts successfully.
NAME                                            READY   STATUS             RESTARTS   AGE
coredns-5c98db65d4-qx4mq                        0/1     CrashLoopBackOff   3          81s
coredns-5c98db65d4-v5mg8                        0/1     CrashLoopBackOff   3          81s
etcd-localhost.localdomain                      1/1     Running            0          33s
kube-apiserver-localhost.localdomain            1/1     Running            0          22s
kube-controller-manager-localhost.localdomain   1/1     Running            0          40s
kube-flannel-ds-amd64-gltqj                     1/1     Running            0          73s
kube-proxy-x2crp                                1/1     Running            0          81s
kube-scheduler-localhost.localdomain            1/1     Running            0          15s
VM: 2 CPU, 4 GB memory
cat /etc/os-release:
CentOS 7.6
uname -a:
Linux localhost.localdomain 3.10.0-957.el7.x86_64
1. Output of journalctl -f -u kubelet:
-- Logs begin at 一 2019-08-05 14:29:46 CST. --
8月 05 16:43:29 localhost.localdomain kubelet[23907]: E0805 16:43:29.325790 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
8月 05 16:43:30 localhost.localdomain kubelet[23907]: E0805 16:43:30.337973 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
8月 05 16:43:31 localhost.localdomain kubelet[23907]: E0805 16:43:31.826577 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
8月 05 16:43:35 localhost.localdomain kubelet[23907]: E0805 16:43:35.781871 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
8月 05 16:43:44 localhost.localdomain kubelet[23907]: E0805 16:43:44.689542 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
8月 05 16:43:48 localhost.localdomain kubelet[23907]: E0805 16:43:48.690229 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
8月 05 16:43:58 localhost.localdomain kubelet[23907]: E0805 16:43:58.689996 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
8月 05 16:44:00 localhost.localdomain kubelet[23907]: E0805 16:44:00.690532 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
8月 05 16:44:12 localhost.localdomain kubelet[23907]: E0805 16:44:12.689339 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
8月 05 16:44:14 localhost.localdomain kubelet[23907]: E0805 16:44:14.690199 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
8月 05 16:44:25 localhost.localdomain kubelet[23907]: E0805 16:44:25.689499 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
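The "Back-off 1m20s" in these messages is the kubelet's exponential restart back-off: it starts at 10s, doubles after each failed start, and is capped at 5 minutes. A minimal sketch of that schedule (the 10s base and 300s cap are the kubelet defaults):

```shell
# kubelet's crash-loop back-off: 10s after the first failure, doubling
# each time, capped at 300s (5m). The pod listing shows RESTARTS=3, i.e.
# four failed starts, so the current back-off is 80s -- the "1m20s"
# reported in the kubelet log above.
delay=10
for failure in 1 2 3 4 5 6 7; do
  echo "after failure $failure: back-off ${delay}s"
  delay=$((delay * 2))
  [ "$delay" -gt 300 ] && delay=300
done
```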
2. Output of kubectl describe pod coredns-5c98db65d4-v5mg8 -n kube-system:
Name: coredns-5c98db65d4-v5mg8
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: localhost.localdomain/10.0.2.15
Start Time: Mon, 05 Aug 2019 16:42:01 +0800
Labels: k8s-app=kube-dns
pod-template-hash=5c98db65d4
Annotations: <none>
Status: Running
IP: 10.244.0.11
Controlled By: ReplicaSet/coredns-5c98db65d4
Containers:
coredns:
Container ID: docker://daf187222dfaa4d686dfd587e782369cb18c7de0c4de4850d8dd871b0dbe200c
Image: k8s.gcr.io/coredns:1.3.1
Image ID: docker://sha256:eb516548c180f8a6e0235034ccee2428027896af16a509786da13022fe95fe8c
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 139
Started: Mon, 05 Aug 2019 16:44:53 +0800
Finished: Mon, 05 Aug 2019 16:44:52 +0800
Ready: False
Restart Count: 5
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8080/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-hzkdx (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-hzkdx:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-hzkdx
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 5m7s (x3 over 5m27s) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 5m5s default-scheduler Successfully assigned kube-system/coredns-5c98db65d4-v5mg8 to localhost.localdomain
Normal Pulled 3m42s (x5 over 5m4s) kubelet, localhost.localdomain Container image "k8s.gcr.io/coredns:1.3.1" already present on machine
Normal Created 3m42s (x5 over 5m4s) kubelet, localhost.localdomain Created container coredns
Normal Started 3m41s (x5 over 5m3s) kubelet, localhost.localdomain Started container coredns
Warning BackOff 3m40s (x10 over 5m1s) kubelet, localhost.localdomain Back-off restarting failed container
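The key datum in this describe output is "Exit Code: 139". Exit statuses above 128 encode death by signal (status − 128), so 139 means signal 11, SIGSEGV: the coredns process is segfaulting on startup rather than logging an error, which also explains why kubectl logs returns nothing. The arithmetic:

```shell
# Decode a container exit status: values above 128 mean the process was
# killed by a signal, numbered (status - 128). Here 139 -> 11 (SIGSEGV).
status=139
sig=$((status - 128))
echo "exit status $status => terminated by signal $sig"  # 11 = SIGSEGV
```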
Environment:
- Kubernetes version (kubectl version):
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
3. Output of kubectl logs coredns-5c98db65d4-qx4mq -n kube-system:
(empty)
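An empty log from the current instance does not rule out output from the crashed one; --previous reads the last terminated container, and the Docker runtime on the node can be queried directly. A sketch, using the pod name from this question (the commands are guarded so they no-op on a machine without kubectl or docker):

```shell
POD=coredns-5c98db65d4-qx4mq

# Logs of the previous (crashed) container instance:
if command -v kubectl >/dev/null 2>&1; then
  kubectl logs "$POD" -n kube-system --previous 2>&1 || true
fi

# Fall back to the container runtime on the node; the kubelet labels
# every container it creates with its pod name:
if command -v docker >/dev/null 2>&1; then
  docker ps -a --filter "label=io.kubernetes.pod.name=$POD" \
    --format '{{.ID}}' 2>/dev/null | head -n1 | xargs -r docker logs || true
fi
```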
4. Output of docker version:
Client: Docker Engine - Community
Version: 19.03.1
API version: 1.39 (downgraded from 1.40)
Go version: go1.12.5
Git commit: 74b1e89
Built: Thu Jul 25 21:21:07 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.7
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 2d0083d
Built: Thu Jun 27 17:26:28 2019
OS/Arch: linux/amd64
Experimental: false
5. The CoreDNS Deployment YAML (shown via kubectl edit):
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{"deployment.kubernetes.io/revision":"1"},"creationTimestamp":"2019-08-05T10:53:11Z","generation":1,"labels":{"k8s-app":"kube-dns"},"name":"coredns","namespace":"kube-system","resourceVersion":"930","selfLink":"/apis/extensions/v1beta1/namespaces/kube-system/deployments/coredns","uid":"7250d0fc-7827-4910-bf87-f8340cde9f09"},"spec":{"progressDeadlineSeconds":600,"replicas":2,"revisionHistoryLimit":10,"selector":{"matchLabels":{"k8s-app":"kube-dns"}},"strategy":{"rollingUpdate":{"maxSurge":"25%","maxUnavailable":1},"type":"RollingUpdate"},"template":{"metadata":{"creationTimestamp":null,"labels":{"k8s-app":"kube-dns"}},"spec":{"containers":[{"args":["-conf","/etc/coredns/Corefile"],"image":"k8s.gcr.io/coredns:1.3.1","imagePullPolicy":"IfNotPresent","livenessProbe":{"failureThreshold":5,"httpGet":{"path":"/health","port":8080,"scheme":"HTTP"},"initialDelaySeconds":60,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":5},"name":"coredns","ports":[{"containerPort":53,"name":"dns","protocol":"UDP"},{"containerPort":53,"name":"dns-tcp","protocol":"TCP"},{"containerPort":9153,"name":"metrics","protocol":"TCP"}],"readinessProbe":{"failureThreshold":3,"httpGet":{"path":"/health","port":8080,"scheme":"HTTP"},"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1},"resources":{"limits":{"memory":"170Mi"},"requests":{"cpu":"100m","memory":"70Mi"}},"securityContext":{"allowPrivilegeEscalation":true,"capabilities":{"add":["NET_BIND_SERVICE"],"drop":["all"]},"readOnlyRootFilesystem":true},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","volumeMounts":[{"mountPath":"/etc/coredns","name":"config-volume","readOnly":true}]}],"dnsPolicy":"Default","nodeSelector":{"beta.kubernetes.io/os":"linux"},"priorityClassName":"system-cluster-critical","restartPolicy":"Always","schedulerName":"default-scheduler","securityContext":{},"serviceAccount":"coredns","serviceAccountN
ame":"coredns","terminationGracePeriodSeconds":30,"tolerations":[{"key":"CriticalAddonsOnly","operator":"Exists"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"}],"volumes":[{"configMap":{"defaultMode":420,"items":[{"key":"Corefile","path":"Corefile"}],"name":"coredns"},"name":"config-volume"}]}}},"status":{"conditions":[{"lastTransitionTime":"2019-08-05T10:53:26Z","lastUpdateTime":"2019-08-05T10:53:26Z","message":"Deployment does not have minimum availability.","reason":"MinimumReplicasUnavailable","status":"False","type":"Available"},{"lastTransitionTime":"2019-08-06T01:45:12Z","lastUpdateTime":"2019-08-06T01:45:12Z","message":"ReplicaSet \"coredns-5c98db65d4\" has timed out progressing.","reason":"ProgressDeadlineExceeded","status":"False","type":"Progressing"}],"observedGeneration":1,"replicas":2,"unavailableReplicas":2,"updatedReplicas":2}}
creationTimestamp: "2019-08-05T10:53:11Z"
generation: 2
labels:
k8s-app: kube-dns
name: coredns
namespace: kube-system
resourceVersion: "1334"
selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/coredns
uid: 7250d0fc-7827-4910-bf87-f8340cde9f09
spec:
progressDeadlineSeconds: 600
replicas: 2
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kube-dns
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
k8s-app: kube-dns
spec:
containers:
- args:
- -conf
- /etc/coredns/Corefile
image: k8s.gcr.io/coredns:1.3.1
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 5
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: coredns
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /health
port: 8080
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
securityContext:
allowPrivilegeEscalation: true
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/coredns
name: config-volume
readOnly: true
dnsPolicy: Default
nodeSelector:
beta.kubernetes.io/os: linux
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: coredns
serviceAccountName: coredns
terminationGracePeriodSeconds: 30
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- effect: NoSchedule
key: node-role.kubernetes.io/master
volumes:
- configMap:
defaultMode: 420
items:
- key: Corefile
path: Corefile
name: coredns
name: config-volume
status:
conditions:
- lastTransitionTime: "2019-08-05T10:53:26Z"
lastUpdateTime: "2019-08-05T10:53:26Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2019-08-06T01:50:20Z"
lastUpdateTime: "2019-08-06T01:50:20Z"
message: ReplicaSet "coredns-7688bbffb9" is progressing.
reason: ReplicaSetUpdated
status: "True"
type: Progressing
observedGeneration: 2
replicas: 3
unavailableReplicas: 3
updatedReplicas: 2
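The status above ("replicas: 3, unavailableReplicas: 3, updatedReplicas: 2") is consistent with the rollout settings: maxSurge is a percentage of replicas rounded up, so with replicas=2 and maxSurge=25% the Deployment may run one extra pod, i.e. up to 3 pods mid-rollout. The arithmetic, as a sketch:

```shell
# maxSurge is computed as a percentage of replicas, rounded *up*:
# ceil(2 * 0.25) = 1 surge pod, so 3 pods total during a rollout.
replicas=2
surge_pct=25
surge=$(( (replicas * surge_pct + 99) / 100 ))   # integer ceiling
max_total=$((replicas + surge))
echo "surge pods: $surge, max pods during rollout: $max_total"
```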
7. The cluster was initialized with kubeadm init --config ./kubeadm.yml --ignore-preflight-errors=Swap.
kubeadm.yml:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 10.0.2.15
nodeRegistration:
taints:
- effect: PreferNoSchedule
key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
networking:
podSubnet: 10.244.0.0/16
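One thing worth double-checking in a flannel setup: the podSubnet here must match the Network value in flannel's kube-flannel.yml ConfigMap, which is 10.244.0.0/16 by default. A minimal sketch that extracts the value from the config; the YAML is inlined here so the sketch is self-contained, but on a real host you would point awk at ./kubeadm.yml:

```shell
# Extract podSubnet from the kubeadm config and compare it against
# flannel's default pod network.
cidr=$(awk '/podSubnet:/ {print $2}' <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
networking:
  podSubnet: 10.244.0.0/16
EOF
)
flannel_default=10.244.0.0/16
if [ "$cidr" = "$flannel_default" ]; then
  echo "podSubnet $cidr matches flannel's default network"
else
  echo "mismatch: podSubnet=$cidr, flannel expects $flannel_default"
fi
```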
Answer 0 (score: 0):
Verify that the node is in the Ready state:
kubectl get nodes -o wide
Verify your flannel network (it should be 10.244.0.0/16 by default). It should have been set during cluster initialization with --pod-network-cidr=10.244.0.0/16.
If the installed Docker version is lower than 1.12.1, remove the MountFlags=slave option from the systemd unit that starts dockerd, then restart Docker.
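The MountFlags change above can be made without editing the vendor unit by using a systemd drop-in. A hedged sketch: the drop-in file name is a conventional choice, and DOCKER_DROPIN_DIR defaults to a scratch directory here so the sketch is safe to run anywhere; on a real node point it at /etc/systemd/system/docker.service.d and run the commented commands as root.

```shell
# Clear MountFlags via a systemd drop-in instead of editing docker.service.
# DOCKER_DROPIN_DIR is a stand-in; on a real node use
# /etc/systemd/system/docker.service.d (and run as root).
dropin_dir=${DOCKER_DROPIN_DIR:-$(mktemp -d)}
mkdir -p "$dropin_dir"
cat > "$dropin_dir/10-clear-mountflags.conf" <<'EOF'
[Service]
MountFlags=
EOF
echo "wrote $dropin_dir/10-clear-mountflags.conf"
# Then: systemctl daemon-reload && systemctl restart docker
```

An empty assignment resets MountFlags to its systemd default, which overrides the MountFlags=slave set in the base unit.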
If these steps don't help, please share more details about your cluster initialization and configuration so the issue can be reproduced.
I reproduced this scenario with the following setup.
uname -a
Linux g-dvmpku-0 3.10.0-957.27.2.el7.x86_64
cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)
kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:15:32Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
kubelet version: Kubernetes v1.15.1
kubectl get pods --all-namespaces:
kube-system   coredns-5c98db65d4-grgcq             1/1   Running   0   5m22s
kube-system   coredns-5c98db65d4-gvk4w             1/1   Running   0   5m22s
kube-system   etcd-g-dvmpku-0                      1/1   Running   0   4m44s
kube-system   kube-apiserver-g-dvmpku-0            1/1   Running   0   4m32s
kube-system   kube-controller-manager-g-dvmpku-0   1/1   Running   0   4m36s
kube-system   kube-flannel-ds-amd64-zhb9v          1/1   Running   0   4m38s
kube-system   kube-proxy-6mdmr                     1/1   Running   0   5m22s
kube-system   kube-scheduler-g-dvmpku-0            1/1   Running   0   4m25s
Everything works fine.
I'm curious about the coredns image:
In your case:
Image ID: docker://sha256:eb516548c180f8a6e0235034ccee2428027896af16a509786da13022fe95fe8c
While in my case:
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:02382353821b12c21b062c59184e227e001079bb13ebd01f9d3270ba0fcbf1e4
If I've got something wrong, please follow up with more details.