I am trying to deploy a Kubernetes cluster. My master node is up and running, but some pods are stuck in the Pending state. Below is the output of kubectl get pods.
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-65b4876956-29tj9 0/1 Pending 0 9h <none> <none> <none> <none>
kube-system calico-node-bf25l 2/2 Running 2 9h <none> master-0-eccdtest <none> <none>
kube-system coredns-7d6cf57b54-b55zw 0/1 Pending 0 9h <none> <none> <none> <none>
kube-system coredns-7d6cf57b54-bk6j5 0/1 Pending 0 12m <none> <none> <none> <none>
kube-system kube-apiserver-master-0-eccdtest 1/1 Running 1 9h <none> master-0-eccdtest <none> <none>
kube-system kube-controller-manager-master-0-eccdtest 1/1 Running 1 9h <none> master-0-eccdtest <none> <none>
kube-system kube-proxy-jhfjj 1/1 Running 1 9h <none> master-0-eccdtest <none> <none>
kube-system kube-scheduler-master-0-eccdtest 1/1 Running 1 9h <none> master-0-eccdtest <none> <none>
kube-system openstack-cloud-controller-manager-tlp4m 1/1 CrashLoopBackOff 114 9h <none> master-0-eccdtest <none> <none>
When I try to view the pod logs, I get the following error:
Error from server: no preferred addresses found; known addresses: []
kubectl get events shows a lot of warnings:
NAMESPACE LAST SEEN TYPE REASON KIND MESSAGE
default 23m Normal Starting Node Starting kubelet.
default 23m Normal NodeHasSufficientMemory Node Node master-0-eccdtest status is now: NodeHasSufficientMemory
default 23m Normal NodeHasNoDiskPressure Node Node master-0-eccdtest status is now: NodeHasNoDiskPressure
default 23m Normal NodeHasSufficientPID Node Node master-0-eccdtest status is now: NodeHasSufficientPID
default 23m Normal NodeAllocatableEnforced Node Updated Node Allocatable limit across pods
default 23m Normal Starting Node Starting kube-proxy.
default 23m Normal RegisteredNode Node Node master-0-eccdtest event: Registered Node master-0-eccdtest in Controller
kube-system 26m Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 3m15s Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 25m Warning DNSConfigForming Pod Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system 23m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
kube-system 23m Normal Pulled Pod Container image "registry.eccd.local:5000/node:v3.6.1-26684321" already present on machine
kube-system 23m Normal Created Pod Created container
kube-system 23m Normal Started Pod Started container
kube-system 23m Normal Pulled Pod Container image "registry.eccd.local:5000/cni:v3.6.1-26684321" already present on machine
kube-system 23m Normal Created Pod Created container
kube-system 23m Normal Started Pod Started container
kube-system 23m Warning Unhealthy Pod Readiness probe failed: Threshold time for bird readiness check: 30s
calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: dial tcp [::1]:9099: connect: connection refused
kube-system 23m Warning Unhealthy Pod Liveness probe failed: Get http://localhost:9099/liveness: dial tcp [::1]:9099: connect: connection refused
kube-system 26m Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 3m15s Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 105s Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 26m Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 22m Warning FailedScheduling Pod 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system 21m Warning FailedScheduling Pod skip schedule deleting pod: kube-system/coredns-7d6cf57b54-w95g4
kube-system 21m Normal SuccessfulCreate ReplicaSet Created pod: coredns-7d6cf57b54-bk6j5
kube-system 26m Warning DNSConfigForming Pod Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system 23m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
kube-system 23m Normal Pulled Pod Container image "registry.eccd.local:5000/kube-apiserver:v1.13.5-1-80cc0db3" already present on machine
kube-system 23m Normal Created Pod Created container
kube-system 23m Normal Started Pod Started container
kube-system 26m Warning DNSConfigForming Pod Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system 23m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
kube-system 23m Normal Pulled Pod Container image "registry.eccd.local:5000/kube-controller-manager:v1.13.5-1-80cc0db3" already present on machine
kube-system 23m Normal Created Pod Created container
kube-system 23m Normal Started Pod Started container
kube-system 23m Normal LeaderElection Endpoints master-0-eccdtest_ed8f0ece-a6cd-11e9-9dd7-fa163e182aab became leader
kube-system 26m Warning DNSConfigForming Pod Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system 23m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
kube-system 23m Normal Pulled Pod Container image "registry.eccd.local:5000/kube-proxy:v1.13.5-1-80cc0db3" already present on machine
kube-system 23m Normal Created Pod Created container
kube-system 23m Normal Started Pod Started container
kube-system 26m Warning DNSConfigForming Pod Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system 23m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
kube-system 23m Normal Pulled Pod Container image "registry.eccd.local:5000/kube-scheduler:v1.13.5-1-80cc0db3" already present on machine
kube-system 23m Normal Created Pod Created container
kube-system 23m Normal Started Pod Started container
kube-system 23m Normal LeaderElection Endpoints master-0-eccdtest_ee2520c1-a6cd-11e9-96a3-fa163e182aab became leader
kube-system 26m Warning DNSConfigForming Pod Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system 36m Warning BackOff Pod Back-off restarting failed container
kube-system 23m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
kube-system 20m Normal Pulled Pod Container image "registry.eccd.local:5000/openstack-cloud-controller-manager:v1.14.0-1-11023d82" already present on machine
kube-system 20m Normal Created Pod Created container
kube-system 20m Normal Started Pod Started container
kube-system 3m20s Warning BackOff Pod Back-off restarting failed container
The only nameserver in resolv.conf is
nameserver 10.96.0.10
I have googled these issues extensively, but none of the solutions I found worked. Any suggestions would be greatly appreciated.
TIA
Answer 0 (score: 1)
Your main problem is the 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate warning message, which appears because of the node-role.kubernetes.io/master:NoSchedule and node.kubernetes.io/not-ready:NoSchedule taints.
These taints prevent pods from being scheduled on the current node.
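You can confirm which taints are currently set on the node (the node name below is the one from your output):
kubectl get node master-0-eccdtest -o jsonpath='{.spec.taints}'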
If you want to be able to schedule pods on the control-plane node, e.g. for a single-machine Kubernetes cluster used for development, run:
kubectl taint nodes instance-1 node-role.kubernetes.io/master-
kubectl taint nodes instance-1 node.kubernetes.io/not-ready:NoSchedule-
But from my point of view, it is better to:
-initiate the cluster using kubeadm
-and schedule all new pods on worker nodes.
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.15.0
...
Your Kubernetes control-plane has initialized successfully!
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
deployment.extensions/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
-ADD the worker node by running the kubeadm join command on the worker (slave) node, as sketched below.
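The exact kubeadm join command, including its token and discovery-token-ca-cert-hash, is printed at the end of kubeadm init; the line below is only a sketch of its general shape with placeholder values. If the token has expired, a fresh command can be printed on the master with kubeadm token create --print-join-command.
sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>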
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
instance-1 Ready master 21m v1.15.0
instance-2 Ready <none> 34s v1.15.0
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-658558ddf8-v2rqx 1/1 Running 0 11m 192.168.23.129 instance-1 <none> <none>
kube-system calico-node-c2tkt 1/1 Running 0 11m 10.132.0.36 instance-1 <none> <none>
kube-system calico-node-dhc66 1/1 Running 0 107s 10.132.0.38 instance-2 <none> <none>
kube-system coredns-5c98db65d4-dqjm7 1/1 Running 0 22m 192.168.23.130 instance-1 <none> <none>
kube-system coredns-5c98db65d4-hh7vd 1/1 Running 0 22m 192.168.23.131 instance-1 <none> <none>
kube-system etcd-instance-1 1/1 Running 0 21m 10.132.0.36 instance-1 <none> <none>
kube-system kube-apiserver-instance-1 1/1 Running 0 21m 10.132.0.36 instance-1 <none> <none>
kube-system kube-controller-manager-instance-1 1/1 Running 0 21m 10.132.0.36 instance-1 <none> <none>
kube-system kube-proxy-qwvkq 1/1 Running 0 107s 10.132.0.38 instance-2 <none> <none>
kube-system kube-proxy-s9gng 1/1 Running 0 22m 10.132.0.36 instance-1 <none> <none>
kube-system kube-scheduler-instance-1 1/1 Running 0 21m 10.132.0.36 instance-1 <none> <none>
Answer 1 (score: 0)
I have solved this issue. I did not have access to the cloud controller's FQDN from the master node. I added DNS entries to /etc/resolv.conf on the master server, and it works now.
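A sketch of that kind of /etc/resolv.conf change; the addresses here are only the upstream servers visible in the DNSConfigForming events above and stand in for whichever DNS servers can actually resolve the cloud controller FQDN in your environment:
# /etc/resolv.conf on the master node (example values only)
nameserver 10.51.40.100
nameserver 10.51.40.103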