kube-controller-manager and kube-scheduler stuck in CrashLoopBackOff in Kubernetes

Date: 2019-05-13 14:18:08

Tags: kubernetes kubectl kubeadm kube-controller-manager kube-scheduler

I am using Calico as the CNI in Kubernetes and am trying to deploy a cluster with one master node across 3 servers. I am using kubeadm and following the official setup guide, but something goes wrong: kube-controller-manager and kube-scheduler enter CrashLoopBackOff and cannot run normally.

I have tried kubeadm reset on every server, rebooted the servers, and downgraded Docker.
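
For reference, a minimal sketch of that per-node cleanup (the iptables flush is the follow-up step the kubeadm reset documentation recommends; the exact commands I ran may have differed slightly):

    # Tear down any cluster state kubeadm set up on this node.
    kubeadm reset

    # kubeadm reset does not flush iptables rules; its docs recommend
    # doing so manually afterwards.
    iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

    # Restart Docker so no stale containers or networks survive re-init.
    systemctl restart docker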

I used

    kubeadm init --apiserver-advertise-address=192.168.213.128 --pod-network-cidr=192.168.0.0/16

to initialize the master, then ran

    kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
    kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

to install Calico.

[root@k8s-master ~]# docker info
Containers: 20
 Running: 18
 Paused: 0
 Stopped: 2
Images: 10
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-957.12.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 972.6MiB
Name: k8s-master
ID: RN6I:PP52:4WTU:UP7E:T3LF:MXVZ:EDBX:RSII:BIRW:36O2:CYJ3:FRV2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Registry Mirrors:
 https://i70c3eqq.mirror.aliyuncs.com/
 https://docker.mirrors.ustc.edu.cn/
Live Restore Enabled: false
Product License: Community Engine
[root@k8s-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
[root@k8s-master ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:08:49Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
[root@k8s-master ~]# kubelet --version
Kubernetes v1.14.1
[root@k8s-master ~]# kubectl get no -A
NAME         STATUS   ROLES    AGE   VERSION
k8s-master   Ready    master   49m   v1.14.1
[root@k8s-master ~]# kubectl get pods -A
NAMESPACE     NAME                                 READY   STATUS             RESTARTS   AGE
kube-system   calico-node-xmc5t                    2/2     Running            0          27m
kube-system   coredns-6765558d84-945mt             1/1     Running            0          28m
kube-system   coredns-6765558d84-xz7lw             1/1     Running            0          28m
kube-system   coredns-fb8b8dccf-z87sl              1/1     Running            0          31m
kube-system   etcd-k8s-master                      1/1     Running            0          30m
kube-system   kube-apiserver-k8s-master            1/1     Running            0          29m
kube-system   kube-controller-manager-k8s-master   0/1     CrashLoopBackOff   8          30m
kube-system   kube-proxy-wp7n9                     1/1     Running            0          31m
kube-system   kube-scheduler-k8s-master            1/1     Running            7          29m

[root@k8s-master ~]# kubectl logs -n kube-system kube-controller-manager-k8s-master
I0513 13:49:51.836448       1 serving.go:319] Generated self-signed cert in-memory
I0513 13:49:52.988794       1 controllermanager.go:155] Version: v1.14.1
I0513 13:49:53.003873       1 secure_serving.go:116] Serving securely on 127.0.0.1:10257
I0513 13:49:53.005146       1 deprecated_insecure_serving.go:51] Serving insecurely on [::]:10252
I0513 13:49:53.008661       1 leaderelection.go:217] attempting to acquire leader lease  kube-system/kube-controller-manager...
I0513 13:50:12.687383       1 leaderelection.go:227] successfully acquired lease kube-system/kube-controller-manager
I0513 13:50:12.700344       1 event.go:209] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-controller-manager", UID:"39adc911-7582-11e9-a70e-000c2908c796", APIVersion:"v1", ResourceVersion:"1706", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k8s-master_fbfa0502-7585-11e9-9939-000c2908c796 became leader
I0513 13:50:13.131264       1 plugins.go:103] No cloud provider specified.
I0513 13:50:13.166088       1 controller_utils.go:1027] Waiting for caches to sync for tokens controller
I0513 13:50:13.368381       1 controllermanager.go:497] Started "podgc"
I0513 13:50:13.368666       1 gc_controller.go:76] Starting GC controller
I0513 13:50:13.368697       1 controller_utils.go:1027] Waiting for caches to sync for GC controller
I0513 13:50:13.368717       1 controller_utils.go:1034] Caches are synced for tokens controller
I0513 13:50:13.453276       1 controllermanager.go:497] Started "attachdetach"
I0513 13:50:13.453534       1 attach_detach_controller.go:323] Starting attach detach controller
I0513 13:50:13.453545       1 controller_utils.go:1027] Waiting for caches to sync for attach detach controller
I0513 13:50:13.461756       1 controllermanager.go:497] Started "clusterrole-aggregation"
I0513 13:50:13.461833       1 clusterroleaggregation_controller.go:148] Starting ClusterRoleAggregator
I0513 13:50:13.461849       1 controller_utils.go:1027] Waiting for caches to sync for ClusterRoleAggregator controller
I0513 13:50:13.517257       1 controllermanager.go:497] Started "endpoint"
I0513 13:50:13.525394       1 endpoints_controller.go:166] Starting endpoint controller
I0513 13:50:13.525425       1 controller_utils.go:1027] Waiting for caches to sync for endpoint controller
I0513 13:50:14.151371       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for rolebindings.rbac.authorization.k8s.io
I0513 13:50:14.151463       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for leases.coordination.k8s.io
I0513 13:50:14.151489       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for limitranges
I0513 13:50:14.163632       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for ingresses.extensions
I0513 13:50:14.163695       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for daemonsets.apps
I0513 13:50:14.163721       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for ingresses.networking.k8s.io
I0513 13:50:14.163742       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for poddisruptionbudgets.policy
W0513 13:50:14.163757       1 shared_informer.go:311] resyncPeriod 67689210101997 is smaller than resyncCheckPeriod 86008177281797 and the informer has already started. Changing it to 86008177281797
I0513 13:50:14.163840       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for networkpolicies.networking.k8s.io
W0513 13:50:14.163848       1 shared_informer.go:311] resyncPeriod 64017623179979 is smaller than resyncCheckPeriod 86008177281797 and the informer has already started. Changing it to 86008177281797
I0513 13:50:14.163867       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for serviceaccounts
I0513 13:50:14.163885       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for deployments.extensions
I0513 13:50:14.163911       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for daemonsets.extensions
I0513 13:50:14.163925       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for controllerrevisions.apps
I0513 13:50:14.163942       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for roles.rbac.authorization.k8s.io
I0513 13:50:14.163965       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for podtemplates
I0513 13:50:14.163994       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for cronjobs.batch
I0513 13:50:14.164004       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for endpoints
I0513 13:50:14.164019       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for replicasets.extensions
I0513 13:50:14.164030       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for replicasets.apps
I0513 13:50:14.164039       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for deployments.apps
I0513 13:50:14.164054       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for jobs.batch
I0513 13:50:14.164079       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for statefulsets.apps
I0513 13:50:14.164097       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for events.events.k8s.io
I0513 13:50:14.164115       1 resource_quota_monitor.go:228] QuotaMonitor created object count evaluator for horizontalpodautoscalers.autoscaling
E0513 13:50:14.164139       1 resource_quota_controller.go:171] initial monitor sync has error: [couldn't start monitor for resource "extensions/v1beta1, Resource=networkpolicies": unable to monitor quota for resource "extensions/v1beta1, Resource=networkpolicies", couldn't start monitor for resource "crd.projectcalico.org/v1, Resource=networkpolicies": unable to monitor quota for resource "crd.projectcalico.org/v1, Resource=networkpolicies"]
I0513 13:50:14.164154       1 controllermanager.go:497] Started "resourcequota"
I0513 13:50:14.171002       1 resource_quota_controller.go:276] Starting resource quota controller
I0513 13:50:14.171096       1 controller_utils.go:1027] Waiting for caches to sync for resource quota controller
I0513 13:50:14.171138       1 resource_quota_monitor.go:301] QuotaMonitor running
I0513 13:50:15.776814       1 controllermanager.go:497] Started "job"
I0513 13:50:15.771658       1 job_controller.go:143] Starting job controller
I0513 13:50:15.807719       1 controller_utils.go:1027] Waiting for caches to sync for job controller
I0513 13:50:23.065972       1 controllermanager.go:497] Started "csrcleaner"
I0513 13:50:23.047495       1 cleaner.go:81] Starting CSR cleaner controller
I0513 13:50:25.019036       1 event.go:209] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-controller-manager", UID:"39adc911-7582-11e9-a70e-000c2908c796", APIVersion:"v1", ResourceVersion:"1706", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k8s-master_fbfa0502-7585-11e9-9939-000c2908c796 stopped leading
I0513 13:50:25.125784       1 leaderelection.go:263] failed to renew lease kube-system/kube-controller-manager: failed to tryAcquireOrRenew context deadline exceeded
F0513 13:50:25.189307       1 controllermanager.go:260] leaderelection lost

[root@k8s-master ~]# kubectl logs -n kube-system kube-scheduler-k8s-master
I0513 14:16:04.350818       1 serving.go:319] Generated self-signed cert in-memory
W0513 14:16:06.203477       1 authentication.go:387] failed to read in-cluster kubeconfig for delegated authentication: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0513 14:16:06.215933       1 authentication.go:249] No authentication-kubeconfig provided in order to lookup client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication won't work.
W0513 14:16:06.215947       1 authentication.go:252] No authentication-kubeconfig provided in order to lookup requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
W0513 14:16:06.218951       1 authorization.go:177] failed to read in-cluster kubeconfig for delegated authorization: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0513 14:16:06.218983       1 authorization.go:146] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
I0513 14:16:06.961417       1 server.go:142] Version: v1.14.1
I0513 14:16:06.974064       1 defaults.go:87] TaintNodesByCondition is enabled, PodToleratesNodeTaints predicate is mandatory
W0513 14:16:06.997875       1 authorization.go:47] Authorization is disabled
W0513 14:16:06.997889       1 authentication.go:55] Authentication is disabled
I0513 14:16:06.997908       1 deprecated_insecure_serving.go:49] Serving healthz insecurely on [::]:10251
I0513 14:16:06.998196       1 secure_serving.go:116] Serving securely on 127.0.0.1:10259
I0513 14:16:08.872649       1 controller_utils.go:1027] Waiting for caches to sync for scheduler controller
I0513 14:16:08.973148       1 controller_utils.go:1034] Caches are synced for scheduler controller
I0513 14:16:09.003227       1 leaderelection.go:217] attempting to acquire leader lease  kube-system/kube-scheduler...
I0513 14:16:25.814160       1 leaderelection.go:227] successfully acquired lease kube-system/kube-scheduler

Why do kube-controller-manager and kube-scheduler go into CrashLoopBackOff, and how can I get them running normally?

1 Answer:

Answer 0 (score: 0)

I replicated the steps you listed on cloud VMs and managed to get it working.

Here are a few ideas that may help:

  1. Make sure you meet all of the prerequisites listed here. (Worth double-checking: kubeadm's prerequisites include 2 GB of RAM per machine, while the docker info above reports only 972.6MiB on the master.)

  2. Install the latest version of Docker following the guide here (choose the correct operating system; since your master runs CentOS 7, a CentOS sketch follows this list).

  3. Install kubeadm using the following commands:

     apt-get update && apt-get install -y apt-transport-https curl
     curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
     cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
     deb https://apt.kubernetes.io/ kubernetes-xenial main
     EOF
     apt-get update
     apt-get install -y kubelet kubeadm kubectl
     apt-mark hold kubelet kubeadm kubectl

  4. Make sure you have the latest version of kubeadm by executing: apt-get update && apt-get upgrade

  5. Make sure you use the proper arguments with kubeadm init.

  6. Don't forget to run:

    • mkdir -p $HOME/.kube

    • sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

    • sudo chown $(id -u):$(id -g) $HOME/.kube/config

     after kubeadm init finishes (these commands are also part of the kubeadm init output).

  7. Finally, apply the .yaml files listed in the question (a quick health-check sketch follows the list as well).
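
Regarding step 2: the commands in step 3 target Debian/Ubuntu, but your master runs CentOS 7, where the Docker CE installation would look roughly like this (a sketch based on the Docker CE docs; you may want to pin the Docker version validated for your Kubernetes release):

     # Add Docker's yum repository and install Docker CE (CentOS 7).
     yum install -y yum-utils
     yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
     yum install -y docker-ce docker-ce-cli containerd.io
     systemctl enable --now docker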
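
Regarding step 7: once the .yaml files are applied, a quick way to confirm the control plane is healthy (a sketch; pod names vary per cluster):

     # All kube-system pods should be Running with low restart counts.
     kubectl get pods -n kube-system

     # In v1.14 this still reports scheduler/controller-manager/etcd health.
     kubectl get componentstatuses

     # If a control-plane pod keeps crash-looping, inspect its previous run:
     kubectl logs -n kube-system kube-controller-manager-k8s-master --previous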

Note that by following the steps above you will end up with v1.14.3 in kubectl version, kubelet --version, and kubectl get no -A, rather than v1.14.1 across the board as shown in your output.
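
If you would rather stay on v1.14.1 to match the versions shown in the question, the packages can be pinned instead (a sketch; the 1.14.1-00 revision suffix is an assumption, confirm it with apt-cache madison):

     # List the available package revisions for kubeadm.
     apt-cache madison kubeadm

     # Install a pinned version (revision suffix assumed) and hold it.
     apt-get install -y kubelet=1.14.1-00 kubeadm=1.14.1-00 kubectl=1.14.1-00
     apt-mark hold kubelet kubeadm kubectl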

Hope this helps.