我正在尝试设置Kubernetes集群,但是我无法运行CoreDNS。我已经运行了以下命令来启动集群:
sudo swapoff -a
sudo sysctl net.bridge.bridge-nf-call-iptables=1
sudo kubeadm init
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s- version=$(kubectl version | base64 | tr -d '\n')"
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
要使用kubectl get pods --all-namespaces
检查POD,我会得到
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-68fb79bcf6-6s5bp 0/1 CrashLoopBackOff 6 10m
kube-system coredns-68fb79bcf6-hckxq 0/1 CrashLoopBackOff 6 10m
kube-system etcd-myserver 1/1 Running 0 79m
kube-system kube-apiserver-myserver 1/1 Running 0 79m
kube-system kube-controller-manager-myserver 1/1 Running 0 79m
kube-system kube-proxy-9ls64 1/1 Running 0 80m
kube-system kube-scheduler-myserver 1/1 Running 0 79m
kube-system kubernetes-dashboard-77fd78f978-tqt8m 1/1 Running 0 80m
kube-system weave-net-zmhwg 2/2 Running 0 80m
因此,CoreDNS一直崩溃。我唯一发现的错误消息来自
/var/log/syslog
:
Oct 4 18:06:44 myserver kubelet[16397]: E1004 18:06:44.961409 16397 pod_workers.go:186] Error syncing pod c456a48b-c7c3-11e8-bf23-02426706c77f ("coredns-68fb79bcf6-6s5bp_kube-system(c456a48b-c7c3-11e8-bf23-02426706c77f)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=coredns pod=coredns-68fb79bcf6-6s5bp_kube-system(c456a48b-c7c3-11e8-bf23-02426706c77f)"
并且来自kubectl logs coredns-68fb79bcf6-6s5bp -n kube-system
:
.:53
2018/10/04 11:04:55 [INFO] CoreDNS-1.2.2
2018/10/04 11:04:55 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/10/04 11:04:55 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
2018/10/04 11:04:55 [FATAL] plugin/loop: Seen "HINFO IN 3256902131464476443.1309143030470211725." more than twice, loop detected
我发现有一些解决方案要发布
kubectl -n kube-system get deployment coredns -o yaml | \
sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | \
kubectl apply -f -
并修改/etc/resolv.conf
以指向实际的DNS,而不是我也尝试过的localhost。
https://kubernetes.io/docs/setup/independent/troubleshooting-kubeadm/#pods-in-runcontainererror-crashloopbackoff-or-error-state中描述了该问题,我尝试了许多不同的Pod网络,但没有帮助。
我已经跑过sudo kubeadm reset && rm -rf ~/.kube/ && sudo kubeadm init
次。
我正在运行Ubuntu 16.04,Kubernetes 1.12和Docker 17.03。有什么想法吗?
答案 0 :(得分:14)
我也有同样的问题。
我已经通过删除coredns cm内的“ loop”插件解决了该问题。 但我不知道这云是否还会引起其他问题。
1,kubectl编辑cm coredns -n kube-system
2,delete ‘loop’ ,save and exit
3,通过以下方式重新启动coredns pod:kubectl delete pod coredns.... -n kube-system
答案 1 :(得分:3)
通过这种方式可以解决一些问题:
kubectl edit cm coredns -n kube-system
kubectl获取容器-n kube-system -oname | grep coredns | xargs kubectl delete -n kube-system
问题的原因在the link in the output of the coredns log中进行了说明。 您可以在此cmd的输出中找到此链接
kubectl记录coredns-7d9cd4f75b-cpwxp -n kube-system
此链接在CoreDNS-1.2.4的输出中。
我使用此cmd升级了CoreDNS
kubectl修补程序部署-n = kube-system coredns -p'{“ spec”:{“ template”:{“ spec”:{“ containers”:[{“ image”:“ k8s.gcr.io/coredns :1.2.4“,” name“:” coredns“,” resources“:{” limits“:{” memory“:” 1Gi“},” requests“:{” cpu“:” 100m“,” memory“: “ 70Mi”}}}]}}}}'
答案 2 :(得分:1)
是的,您是对的。已here和on GitHub描述了该问题。解决方案是升级Docker,禁用SElinux或将allowPrivilegeEscalation
修改为true
。
但是,今天我尝试重现您的问题,但无法执行此操作。
为您提供命令和输出,也许它将帮助您从一开始就创建一个有效的版本。
Docker版本17.03.2-ce,Kubernetes v1.12.0,Ubuntu 16.04,CoreDNS-1.2.2,在GCP中创建的实例。
#apt-get update && apt-get install -y mc ebtables ethtool docker.io apt-transport-https curl
#curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
#cat <<EOF >/etc/apt/sources.list.d/kubernetes.list \
deb http://apt.kubernetes.io/ kubernetes-xenial main \
EOF
#apt-get update && apt-get install -y kubelet kubeadm kubectl
#kubeadm init
$mkdir -p $HOME/.kube
$sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$sudo chown $(id -u):$(id -g) $HOME/.kube/config
$kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
已创建服务帐户/编织网
clusterrole.rbac.authorization.k8s.io/weave-net已创建
clusterrolebinding.rbac.authorization.k8s.io/weave-net已创建
role.rbac.authorization.k8s.io/weave-net已创建
rolebinding.rbac.authorization.k8s.io/weave-net已创建
daemonset.extensions / weave-net已创建
$kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-576cbf47c7-6qbtq 0/1 Pending 0 79s
kube-system pod/coredns-576cbf47c7-jr6hb 0/1 Pending 0 79s
kube-system pod/etcd-kube-weave-master-1 1/1 Running 0 38s
kube-system pod/kube-apiserver-kube-weave-master-1 1/1 Running 0 28s
kube-system pod/kube-controller-manager-kube-weave-master-1 1/1 Running 0 30s
kube-system pod/kube-proxy-4p9l5 1/1 Running 0 79s
kube-system pod/kube-scheduler-kube-weave-master-1 1/1 Running 0 34s
kube-system pod/weave-net-z6mhw 0/2 ContainerCreating 0 8s
再过一分钟:
$kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-576cbf47c7-6qbtq 1/1 Running 0 98s
kube-system pod/coredns-576cbf47c7-jr6hb 1/1 Running 0 98s
kube-system pod/etcd-kube-weave-master-1 1/1 Running 0 57s
kube-system pod/kube-apiserver-kube-weave-master-1 1/1 Running 0 47s
kube-system pod/kube-controller-manager-kube-weave-master-1 1/1 Running 0 49s
kube-system pod/kube-proxy-4p9l5 1/1 Running 0 98s
kube-system pod/kube-scheduler-kube-weave-master-1 1/1 Running 0 53s
kube-system pod/weave-net-z6mhw 2/2 Running 0 27s
Coredns吊舱说明:
kubectl describe pod/coredns-576cbf47c7-6qbtq -n kube-system
Name: coredns-576cbf47c7-6qbtq
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: kube-weave-master-1/10.154.0.8
Start Time: Fri, 05 Oct 2018 11:06:54 +0000
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: <none>
Status: Running
IP: 10.32.0.3
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID: docker://db1712600b4c927b99063fa41bc36c3346c55572bd63730fc993f03379fa457b
Image: k8s.gcr.io/coredns:1.2.2
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Fri, 05 Oct 2018 11:06:57 +0000
Ready: True
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-wp7tm (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-wp7tm:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-wp7tm
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 23m (x12 over 24m) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 23m default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-6qbtq to kube-weave-master-1
Normal Pulled 23m kubelet, kube-weave-master-1 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Created 23m kubelet, kube-weave-master-1 Created container
Normal Started 23m kubelet, kube-weave-master-1 Started container
还要提供config.yaml
使用的kubeadm init --config config.yaml
,以更好地了解您在指定配置文件位置时遇到的问题。
答案 3 :(得分:1)
我认为仅仅从Kubernetes删除loop
函数并不是一个干净的方法。 CoreDNS Github实际上为解决此问题提供了一些指导。
他们在指南中建议了3种方法
- 将以下内容添加到kubelet:--resolv-conf。您的“真实” resolv.conf是包含上游服务器的实际IP,但没有本地/环回地址的地址。此标志告诉kubelet将备用resolv.conf传递给Pods。对于使用systemd-resolved的系统,/ run / systemd / resolve / resolv.conf通常是“真实” resolv.conf的位置,尽管根据您的发行版可能有所不同。
- 在主机节点上禁用本地DNS缓存,然后将/etc/resolv.conf恢复为原始值。
- 一个快速而肮脏的解决方法是编辑您的Corefile,以替换proxy。 /etc/resolv.conf和上游DNS的IP地址,例如proxy。 8.8.8.8。但这只能解决CoreDNS的问题,kubelet将继续将无效的resolv.conf转发到所有默认的dnsPolicy Pod,从而使他们无法解析DNS。
答案 4 :(得分:1)
我的解决方案是删除--network-plugin=cni
中的/var/lib/kubelet/kubeadm-flags.env
答案 5 :(得分:0)
我的解决方案是删除/var/lib/kubelet/kubeadmflags.env中的--network-plugin = cni,然后重新启动计算机,COredns将运行。祝你好运