I'm running into a problem where CoreDNS pods on some nodes are in CrashLoopBackOff because of errors when they try to reach the internal kubernetes service.
This is a new K8s cluster deployed with Kubespray on OpenStack, running Kubernetes v1.12.5 with Weave as the network layer. I have tested connectivity to the endpoints directly and there is no problem reaching, for example, 10.2.70.14:6443. But telnet from a pod to 10.233.0.1:443 fails.
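For context, this is roughly how I tested (the busybox image and pod name below are just examples to illustrate the check):

# From a node: reaching an apiserver endpoint directly works
telnet 10.2.70.14 6443

# From a pod: reaching the kubernetes service ClusterIP fails
kubectl run -it --rm net-test --image=busybox:1.28 --restart=Never -- telnet 10.233.0.1 443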
Thanks in advance for any help.
kubectl describe svc kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.233.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 10.2.70.14:6443,10.2.70.18:6443,10.2.70.27:6443 + 2 more...
Session Affinity: None
Events: <none>
From the CoreDNS logs:
E0415 17:47:05.453762 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:311: Failed to list *v1.Service: Get https://10.233.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.233.0.1:443: connect: connection refused
E0415 17:47:05.456909 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:313: Failed to list *v1.Endpoints: Get https://10.233.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.233.0.1:443: connect: connection refused
E0415 17:47:06.453258 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:318: Failed to list *v1.Namespace: Get https://10.233.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.233.0.1:443: connect: connection refused
Also, checking the kube-proxy logs on one of the problematic nodes shows the following errors:
I0415 19:14:32.162909 1 graceful_termination.go:160] Trying to delete rs: 10.233.0.1:443/TCP/10.2.70.36:6443
I0415 19:14:32.162979 1 graceful_termination.go:171] Not deleting, RS 10.233.0.1:443/TCP/10.2.70.36:6443: 1 ActiveConn, 0 InactiveConn
I0415 19:14:32.162989 1 graceful_termination.go:160] Trying to delete rs: 10.233.0.1:443/TCP/10.2.70.18:6443
I0415 19:14:32.163017 1 graceful_termination.go:171] Not deleting, RS 10.233.0.1:443/TCP/10.2.70.18:6443: 1 ActiveConn, 0 InactiveConn
E0415 19:14:32.215707 1 proxier.go:430] Failed to execute iptables-restore for nat: exit status 1 (iptables-restore: line 7 failed
)
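For reference, I pulled these logs with commands along these lines; the k8s-app label selectors are assumptions and may differ depending on how Kubespray deployed these components:

# CoreDNS logs from a crashing pod
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
kubectl -n kube-system logs --previous <coredns-pod-name>

# kube-proxy logs on the problematic node
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
kubectl -n kube-system logs <kube-proxy-pod-on-that-node>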
Answer 0 (score: 0)
I had exactly the same problem, and it turned out my kubespray config was wrong, specifically the nginx ingress setting ingress_nginx_host_network.
It turns out you have to set ingress_nginx_host_network: true (it defaults to false).
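For reference, I set it in the inventory's addons group_vars; the exact path depends on your Kubespray version, and inventory/mycluster/group_vars/k8s-cluster/addons.yml here is just an example:

# inventory/mycluster/group_vars/k8s-cluster/addons.yml (example path)
ingress_nginx_enabled: true
ingress_nginx_host_network: true   # <- defaults to false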
If you don't want to re-run the whole kubespray playbook, edit the nginx ingress daemonset:
$ kubectl -n ingress-nginx edit ds ingress-nginx-controller
Add the --report-node-internal-ip-address flag to the container args under spec:

spec:
  containers:
  - args:
    - /nginx-ingress-controller
    - --configmap=$(POD_NAMESPACE)/ingress-nginx
    - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
    - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
    - --annotations-prefix=nginx.ingress.kubernetes.io
    - --report-node-internal-ip-address # <- new
  serviceAccountName: ingress-nginx
Then add the following two settings at the same level as serviceAccountName: ingress-nginx:
hostNetwork: true # <- new
dnsPolicy: ClusterFirstWithHostNet # <- new
Then save and exit with :wq, and check the pod status with kubectl get pods --all-namespaces.
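Once the CoreDNS pods are back to Running, a quick sanity check that cluster DNS works again (the busybox image and pod name are just examples):

kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default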
来源: https://github.com/kubernetes-sigs/kubespray/issues/4357