I am able to remove the taint from the master, but my two worker nodes, installed on bare metal with kubeadm, keep the unreachable taint even after I issue the command to remove it. The command says the taint was removed, but it is not permanent: when I check again, the taint is still there. I also tried patching the node and setting the taints to null, but that did not work either. Everything I have found on SO or elsewhere either concerns the master or assumes these commands work.
Update: I checked the taint's timestamp and it gets added again right after it is removed. So in what sense is the node unreachable? I can ping it. Is there some Kubernetes diagnostic I can run to find out how it is unreachable? I verified that the master and the workers can ping each other in both directions. So where would the logs show an error indicating which component cannot connect?
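A reasonable place to start looking (a sketch, assuming SSH access to the worker and a systemd-managed kubelet; adjust the node name as needed):

# On the master: node conditions and recent events usually explain why the controller considers the node unreachable
kubectl describe node k8s-node1
kubectl get events --all-namespaces --field-selector involvedObject.name=k8s-node1

# On the worker itself: "unreachable" means the kubelet stopped posting node status, so check whether it is running and why it exited
systemctl status kubelet
journalctl -u kubelet --no-pager | tail -n 50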
kubectl describe no k8s-node1 | grep -i taint
Taints: node.kubernetes.io/unreachable:NoSchedule
Tried:
kubectl patch node k8s-node1 -p '{"spec":{"Taints":[]}}'
and
kubectl taint nodes --all node.kubernetes.io/unreachable:NoSchedule-
kubectl taint nodes --all node.kubernetes.io/unreachable:NoSchedule-
node/k8s-node1 untainted
node/k8s-node2 untainted
error: taint "node.kubernetes.io/unreachable:NoSchedule" not found
The output says the two worker nodes are untainted, but when I grep again I still see the taint:
kubectl describe no k8s-node1 | grep -i taint
Taints: node.kubernetes.io/unreachable:NoSchedule
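Two things are worth noting here. First, the field name in the node spec is lowercase taints, so the patch above with a capital T is most likely ignored; a patch that actually targets the field would look like this (a sketch):

kubectl patch node k8s-node1 -p '{"spec":{"taints":[]}}'

Second, even a successful removal will not stick: the node lifecycle controller re-applies node.kubernetes.io/unreachable for as long as the node's Ready condition is Unknown, which matches the NotReady status shown below.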
$ k get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 10d v1.14.2
k8s-node1 NotReady <none> 10d v1.14.2
k8s-node2 NotReady <none> 10d v1.14.2
Update: found someone who ran into the same problem and could only resolve it by resetting the cluster with kubeadm:
https://forum.linuxfoundation.org/discussion/846483/lab2-1-kubectl-untainted-not-working
I certainly hope I do not have to do this every time a worker node gets tainted.
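Before resorting to a full reset, it is worth checking why the controller keeps re-tainting the node. A quick way to pull just the node conditions (a sketch, assuming the node name below):

kubectl get node k8s-node2 -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'

In this case every condition is Unknown with reason NodeStatusUnknown (see the describe output below), which points at the kubelet on the node rather than at the network between master and workers.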
k describe node k8s-node2
Name: k8s-node2
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-node2
kubernetes.io/os=linux
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"d2:xx:61:c3:xx:16"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 10.xx.1.xx
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 05 Jun 2019 11:46:12 +0700
Taints: node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ------------------ ------------------- ---- -------
MemoryPressure Unknown Fri, 14 Jun 2019 10:34:07 +0700 Fri, 14 Jun 2019 10:35:09 +0700 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Fri, 14 Jun 2019 10:34:07 +0700 Fri, 14 Jun 2019 10:35:09 +0700 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Fri, 14 Jun 2019 10:34:07 +0700 Fri, 14 Jun 2019 10:35:09 +0700 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Fri, 14 Jun 2019 10:34:07 +0700 Fri, 14 Jun 2019 10:35:09 +0700 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 10.10.10.xx
Hostname: k8s-node2
Capacity:
cpu: 2
ephemeral-storage: 26704124Ki
memory: 4096032Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 24610520638
memory: 3993632Ki
pods: 110
System Info:
Machine ID: 6e4e4e32972b3b2f27f021dadc61d21
System UUID: 6e4e4ds972b3b2f27f0cdascf61d21
Boot ID: abfa0780-3b0d-sda9-a664-df900627be14
Kernel Version: 4.4.0-87-generic
OS Image: Ubuntu 16.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://17.3.3
Kubelet Version: v1.14.2
Kube-Proxy Version: v1.14.2
PodCIDR: 10.xxx.10.1/24
Non-terminated Pods: (18 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
heptio-sonobuoy sonobuoy-systemd-logs-daemon-set-6a8d92061c324451-hnnp9 0 (0%) 0 (0%) 0 (0%) 0 (0%) 2d1h
istio-system istio-pilot-7955cdff46-w648c 110m (5%) 2100m (105%) 228Mi (5%) 1224Mi (31%) 6h55m
istio-system istio-telemetry-5c9cb76c56-twzf5 150m (7%) 2100m (105%) 228Mi (5%) 1124Mi (28%) 6h55m
istio-system zipkin-8594bbfc6b-9p2qc 0 (0%) 0 (0%) 1000Mi (25%) 1000Mi (25%) 6h55m
knative-eventing webhook-576479cc56-wvpt6 0 (0%) 0 (0%) 1000Mi (25%) 1000Mi (25%) 6h45m
knative-monitoring elasticsearch-logging-0 100m (5%) 1 (50%) 0 (0%) 0 (0%) 3d20h
knative-monitoring grafana-5cdc94dbd-mc4jn 100m (5%) 200m (10%) 100Mi (2%) 200Mi (5%) 3d21h
knative-monitoring kibana-logging-7cb6b64bff-dh8nx 100m (5%) 1 (50%) 0 (0%) 0 (0%) 3d20h
knative-monitoring kube-state-metrics-56f68467c9-vr5cx 223m (11%) 243m (12%) 176Mi (4%) 216Mi (5%) 3d21h
knative-monitoring node-exporter-7jw59 110m (5%) 220m (11%) 50Mi (1%) 90Mi (2%) 3d22h
knative-monitoring prometheus-system-0 0 (0%) 0 (0%) 400Mi (10%) 1000Mi (25%) 3d20h
knative-serving activator-6cfb97bccf-bfc4w 120m (6%) 2200m (110%) 188Mi (4%) 1624Mi (41%) 6h45m
knative-serving autoscaler-85749b6c48-4wf6z 130m (6%) 2300m (114%) 168Mi (4%) 1424Mi (36%) 6h45m
knative-serving controller-b49d69f4d-7j27s 100m (5%) 1 (50%) 100Mi (2%) 1000Mi (25%) 6h45m
knative-serving networking-certmanager-5b5d8f5dd8-qjh5q 100m (5%) 1 (50%) 100Mi (2%) 1000Mi (25%) 6h45m
knative-serving networking-istio-7977b9bbdd-vrpl5 100m (5%) 1 (50%) 100Mi (2%) 1000Mi (25%) 6h45m
kube-system canal-qbn67 250m (12%) 0 (0%) 0 (0%) 0 (0%) 10d
kube-system kube-proxy-phbf5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 10d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1693m (84%) 14363m (718%)
memory 3838Mi (98%) 11902Mi (305%)
ephemeral-storage 0 (0%) 0 (0%)
Events: <none>
Answer (score: 1)
The problem was that swap was turned on on the worker nodes, so the kubelet crashed and exited. This was evident in the syslog file under /var, and the taint kept being re-added until it was resolved. Perhaps someone can comment on the implications of allowing the kubelet to run with swap enabled?
kubelet[29207]: F0616 06:25:05.597536   29207 server.go:265] failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename    Type    Size    Used    Priority    /dev/xvda5    partition    4191228    0    -1]
Jun 16 06:25:05 k8s-node2 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jun 16 06:25:05 k8s-node2 systemd[1]: kubelet.service: Unit entered failed state.
Jun 16 06:25:05 k8s-node2 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jun 16 06:25:15 k8s-node2 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jun 16 06:25:15 k8s-node2 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jun 16 06:25:15 k8s-node2 systemd[1]: Started kubelet: The Kubernetes Node Agent.
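The fix that follows from this (a sketch, assuming an Ubuntu worker with the swap partition listed in /etc/fstab) is to disable swap on the node and let the kubelet come back up; once the node reports Ready again, the controller removes the unreachable taint on its own:

# On the affected worker node
sudo swapoff -a                                  # turn swap off immediately
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab   # comment out the swap entry so the change survives a reboot
sudo systemctl restart kubelet

# Back on the master, the node should return to Ready within a minute or so
kubectl get nodes
kubectl describe node k8s-node2 | grep -i taint

Alternatively, as the log itself suggests, the kubelet can be started with --fail-swap-on=false, but that keeps swap at the cost of less predictable memory behaviour for pods.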