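I'm new to Kubernetes and I'm trying to create a cluster. After setting up the master node with kubeadm, I found that some pods have errors, and as a result the master stays in the NotReady state.
Everything seems to stem from the fact that kube-proxy cannot list endpoints and services... and therefore (as far as I understand) cannot update iptables.
Here is my kubectl version: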
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Here are the logs from the kube-proxy pod:
$ kubectl logs -n kube-system kube-proxy-xjxck
W0430 12:33:28.887260 1 server_others.go:267] Flag proxy-mode="" unknown, assuming iptables proxy
W0430 12:33:28.913671 1 node.go:113] Failed to retrieve node info: Unauthorized
I0430 12:33:28.915780 1 server_others.go:147] Using iptables Proxier.
W0430 12:33:28.916065 1 proxier.go:314] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
W0430 12:33:28.916089 1 proxier.go:319] clusterCIDR not specified, unable to distinguish between internal and external traffic
I0430 12:33:28.917555 1 server.go:555] Version: v1.14.1
I0430 12:33:28.959345 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0430 12:33:28.960392 1 config.go:202] Starting service config controller
I0430 12:33:28.960444 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0430 12:33:28.960572 1 config.go:102] Starting endpoints config controller
I0430 12:33:28.960609 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
E0430 12:33:28.970720 1 event.go:191] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"fh-ubuntu01.159a40901fa85264", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"fh-ubuntu01", UID:"fh-ubuntu01", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kube-proxy.", Source:v1.EventSource{Component:"kube-proxy", Host:"fh-ubuntu01"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbf2a2e0639406264, ext:334442672, loc:(*time.Location)(0x2703080)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf2a2e0639406264, ext:334442672, loc:(*time.Location)(0x2703080)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Unauthorized' (will not retry!)
E0430 12:33:28.970939 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Unauthorized
E0430 12:33:28.971106 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Unauthorized
E0430 12:33:29.977038 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Unauthorized
E0430 12:33:29.979890 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Unauthorized
E0430 12:33:30.980098 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Unauthorized
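To see at a glance which list calls are being rejected, the repeated reflector errors can be tallied. The following is a minimal sketch, not part of kube-proxy or kubectl: `count_unauthorized_lists` is a hypothetical helper name, and it assumes the log line format shown above.

```shell
# Hypothetical helper: reads kube-proxy log text on stdin and prints, for each
# resource kind, how many "Failed to list *v1.<Kind>: Unauthorized" lines occur.
count_unauthorized_lists() {
  # Keep only the resource kind from matching lines, then count per kind.
  sed -n 's/.*Failed to list \*v1\.\([A-Za-z]*\): Unauthorized.*/\1/p' |
    sort | uniq -c | awk '{ print $2, $1 }'
}
```

It could be fed with `kubectl logs -n kube-system kube-proxy-xjxck | count_unauthorized_lists`. Note that `Unauthorized` normally corresponds to an HTTP 401, meaning the API server did not accept the credentials presented, whereas a missing RBAC permission is usually reported as `Forbidden`.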
I then created a new ClusterRoleBinding like this:
$ kubectl create clusterrolebinding kube-proxy-binding --clusterrole=system:node-proxier --user=system:kube-proxy
If I describe the ClusterRole, I see:
$ kubectl describe clusterrole system:node-proxier
Name:         system:node-proxier
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources  Non-Resource URLs  Resource Names  Verbs
  ---------  -----------------  --------------  -----
  events     []                 []              [create patch update]
  nodes      []                 []              [get]
  endpoints  []                 []              [list watch]
  services   []                 []              [list watch]
So the user "system:kube-proxy" should be able to list endpoints and services, right? Now, if I print the kube-proxy ConfigMap as YAML, I get this:
$ kubectl get configmap kube-proxy -n kube-system -o yaml
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 5
    clusterCIDR: ""
    configSyncPeriod: 15m0s
    conntrack:
      max: null
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: ""
    nodePortAddresses: null
    oomScoreAdj: -999
    portRange: ""
    resourceContainer: /kube-proxy
    udpIdleTimeout: 250ms
    winkernel:
      enableDSR: false
      networkName: ""
      sourceVip: ""
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://10.0.1.1:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  creationTimestamp: "2019-03-21T10:34:03Z"
  labels:
    app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "4458115"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kube-proxy
  uid: d8a454fb-4bc4-11e9-b0b4-00155d044109
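What confuses me is the "user: default" entry... Which user is kube-proxy trying to authenticate as? Is there an actual user named "default"?
Thank you very much!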
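Regarding the `user: default` entry: in a kubeconfig file, the names under `users` are local labels, and the identity the API server sees comes from the credential itself, here the token read from `tokenFile`. Service account tokens are JWTs, so one way to check the identity is to decode the token's payload and look at its `sub` claim, which has the form `system:serviceaccount:<namespace>:<name>`. A minimal sketch, assuming GNU `base64` and a POSIX shell; `jwt_claims` is a hypothetical helper name, and it only decodes the token without verifying its signature:

```shell
# Hypothetical helper: prints the decoded payload (claims) of the JWT passed
# as $1. It does NOT verify the signature.
jwt_claims() {
  # The payload is the second dot-separated field, encoded as base64url.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits '=' padding; restore it so that base64 -d accepts the input.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}
```

For example, `jwt_claims "$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"` could be run wherever that file is readable (the path is the one from the ConfigMap above).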
Output of kubectl get po -n kube-system:
$ kubectl get po -n kube-system
NAME                                  READY   STATUS             RESTARTS   AGE
coredns-fb8b8dccf-27qck               0/1     Pending            0          7d15h
coredns-fb8b8dccf-dd6bh               0/1     Pending            0          7d15h
kube-apiserver-fh-ubuntu01            1/1     Running            1          7d15h
kube-controller-manager-fh-ubuntu01   1/1     Running            0          7d15h
kube-proxy-xjxck                      1/1     Running            0          43h
kube-scheduler-fh-ubuntu01            1/1     Running            1          7d15h
weave-net-psqh5                       1/2     CrashLoopBackOff   2144       7d15h
The cluster components report as healthy:
$ kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-2               Healthy   {"health": "true"}
etcd-3               Healthy   {"health": "true"}
etcd-0               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}
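Answer 0 (score: 1)
Run the following command to check the cluster health:
kubectl get cs
Then check the status of the control plane pods:
kubectl get po -n kube-system
The problem seems to be with the weave-net-psqh5 pod. Find out why it went into the CrashLoopBackOff state.
Please share the logs from weave-net-psqh5.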
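The pod check above can be narrowed to the pods that are not fully ready. This is a small sketch that assumes the default column layout of `kubectl get po` (NAME, READY, STATUS, ...); `not_ready_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get po` output on stdin and prints the
# name and status of every pod whose READY column is not N/N.
# Pods that ran to completion (e.g. 0/1 Completed) are listed as well.
not_ready_pods() {
  awk 'NR > 1 {                       # skip the header row
    split($2, ready, "/")             # READY looks like "1/2"
    if (ready[1] != ready[2]) print $1, $3
  }'
}
```

Usage: `kubectl get po -n kube-system | not_ready_pods`. For the pod it reports, `kubectl describe pod -n kube-system weave-net-psqh5` and `kubectl logs -n kube-system weave-net-psqh5 --all-containers` should show its events and logs; adding `--previous` to `kubectl logs` returns the output of the last terminated container instance.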