k3sup join succeeds but the node does not join, TLS problem?

Date: 2020-10-22 17:50:01

Tags: k3s

I'm new to Kubernetes. I'm running k3s / k3sup on RPi4 boards, since it seems to be the most viable solution for ARM hardware. The host OS is Raspberry Pi OS 32-bit Lite, freshly installed. I have one master (2 GB RAM) and two slaves/workers (8 GB each).

I'm not quite sure what I did differently (there was a lot of trial and error), but in the end I got one working slave, while the other slave is never recognized.

Here is my install script (run on the master):

sudo apt install dnsutils -y # provides dig, used below

curl -sSL https://get.k3sup.dev | sudo sh
curl -sSL https://dl.get-arkade.dev | sudo sh # not used yet

export KUBECONFIG=`pwd`/kubeconfig

# configure ssh between master and slaves
ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub pi@master.home
ssh-copy-id -i ~/.ssh/id_rsa.pub pi@slave1.home
ssh-copy-id -i ~/.ssh/id_rsa.pub pi@slave2.home
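
# side note: id_rsa above was created with a passphrase, hence the
# passphrase prompts in the k3sup join output further down; loading the
# key into ssh-agent once would avoid them (illustrative, not part of
# my original script)
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa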

# install k3s / k3sup everywhere
k3sup install --ip $(dig +short master.home) --user $(whoami)
k3sup install --ip $(dig +short slave1.home) --user $(whoami)
k3sup install --ip $(dig +short slave2.home) --user $(whoami)

# slaves join the cluster, labeled as workers
k3sup join --ip $(dig +short slave1.home) --server-ip $(dig +short master.home) --user $(whoami)
sudo kubectl label node slave1 node-role.kubernetes.io/worker=worker

k3sup join --ip $(dig +short slave2.home) --server-ip $(dig +short master.home) --user $(whoami)
sudo kubectl label node slave2 node-role.kubernetes.io/worker=worker
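
For completeness, here is a quick health check that could be run once the script has finished; a minimal sketch, assuming only the default systemd unit names the installers create (k3s on the server, k3s-agent on the agents, as seen in the join output below):

# on the master: the server unit
sudo systemctl is-active k3s

# on the slaves: the agent units
for h in slave1.home slave2.home; do
    ssh pi@$h sudo systemctl is-active k3s-agent
done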

k3sup join produces output like this:

pi@master:~ $ k3sup join --ip $(dig +short slave1.home) --server-ip $(dig +short master.home) --user $(whoami)
Running: k3sup join
Server IP: 192.168.1.9
Enter passphrase for '/home/pi/.ssh/id_rsa':
xxxxx....::server:yyyyy
Enter passphrase for '/home/pi/.ssh/id_rsa':
[INFO]  Finding release for channel v1.18
[INFO]  Using v1.18.10+k3s1 as release
[INFO]  Downloading hash https://github.com/rancher/k3s/releases/download/v1.18.10+k3s1/sha256sum-arm.txt
[INFO]  Downloading binary https://github.com/rancher/k3s/releases/download/v1.18.10+k3s1/k3s-armhf
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO]  systemd: Starting k3s-agent
Logs: Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
Output: [INFO]  Finding release for channel v1.18
[INFO]  Using v1.18.10+k3s1 as release
[INFO]  Downloading hash https://github.com/rancher/k3s/releases/download/v1.18.10+k3s1/sha256sum-arm.txt
[INFO]  Downloading binary https://github.com/rancher/k3s/releases/download/v1.18.10+k3s1/k3s-armhf
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
[INFO]  systemd: Starting k3s-agent

So far so good, right? Well....

pi@master:~ $ sudo kubectl get node -o wide
NAME     STATUS   ROLES    AGE     VERSION         INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
master   Ready    master   7h9m    v1.18.10+k3s1   192.168.1.9   <none>        Raspbian GNU/Linux 10 (buster)   5.4.51-v7l+      containerd://1.3.3-k3s2
slave2   Ready    worker   6h46m   v1.18.10+k3s1   192.168.1.6   <none>        Raspbian GNU/Linux 10 (buster)   5.4.65-v7l+      containerd://1.3.3-k3s2

Where is slave1?
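
The agent side can also be checked directly on slave1 (a sketch; it assumes the k3s-agent unit that the join output above created):

ssh pi@slave1.home "sudo systemctl status k3s-agent --no-pager"
ssh pi@slave1.home "sudo journalctl -u k3s-agent --no-pager -n 50"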

On the master, journalctl -xeu k3s shows:

oct 22 17:20:10 master k3s[538]: I1022 17:20:10.692778     538 log.go:172] http: TLS handshake error from 192.168.1.5:56432: remote error: tls: bad certificate
[....]
oct 22 17:20:12 master k3s[538]: time="2020-10-22T17:20:12.197507915+02:00" level=info msg="Handling backend connection request [slave1]"
oct 22 17:20:12 master k3s[538]: I1022 17:20:12.731568     538 log.go:172] http: TLS handshake error from 192.168.1.5:56522: EOF
[....]
oct 22 17:20:12 master k3s[538]: time="2020-10-22T17:20:12.733176514+02:00" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
oct 22 17:22:31 master k3s[538]: E1022 17:22:31.380781     538 machine.go:331] failed to get cache information for node 0: open /sys/devices/system/cpu/cpu0/cache: no such file or directory
[....]
oct 22 18:22:31 master k3s[538]: I1022 18:22:31.904927     538 trace.go:116] Trace[69126570]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-10-22 18:22:31.3753156 +0200 CEST m=+20417.996929673) (total time: 529.51
oct 22 18:22:31 master k3s[538]: Trace[69126570]: [529.405364ms] [527.360568ms] Transaction committed
oct 22 18:22:31 master k3s[538]: I1022 18:22:31.905446     538 trace.go:116] Trace[2049611301]: "Update" url:/api/v1/namespaces/kube-system/endpoints/rancher.io-local-path,user-agent:local-path-provisioner/v0.0.0 (linux/arm) kub
oct 22 18:22:31 master k3s[538]: Trace[2049611301]: [530.448956ms] [529.964521ms] Object stored in database
oct 22 18:27:31 master k3s[538]: E1022 18:27:31.339315     538 machine.go:331] failed to get cache information for node 0: open /sys/devices/system/cpu/cpu0/cache: no such file or directory
[....]
oct 22 19:17:41 master k3s[538]: I1022 19:17:41.164977     538 log.go:172] http: TLS handshake error from 192.168.1.231:56800: write tcp 192.168.1.9:6443->192.168.1.231:56800: write: connection reset by peer
oct 22 19:17:41 master k3s[538]: I1022 19:17:41.165105     538 log.go:172] http: TLS handshake error from 192.168.1.231:56802: read tcp 192.168.1.9:6443->192.168.1.231:56802: read: connection reset by peer
oct 22 19:17:41 master k3s[538]: I1022 19:17:41.165278     538 log.go:172] http: TLS handshake error from 192.168.1.231:56801: read tcp 192.168.1.9:6443->192.168.1.231:56801: read: connection reset by peer
oct 22 19:17:41 master k3s[538]: I1022 19:17:41.165601     538 log.go:172] http: TLS handshake error from 192.168.1.231:56783: read tcp 192.168.1.9:6443->192.168.1.231:56783: read: connection reset by peer
oct 22 19:17:41 master k3s[538]: I1022 19:17:41.170027     538 log.go:172] http: TLS handshake error from 192.168.1.231:56789: write tcp 192.168.1.9:6443->192.168.1.231:56789: write: connection reset by peer
oct 22 19:17:41 master k3s[538]: I1022 19:17:41.170179     538 log.go:172] http: TLS handshake error from 192.168.1.231:56799: write tcp 192.168.1.9:6443->192.168.1.231:56799: write: connection reset by peer
oct 22 19:22:31 master k3s[538]: E1022 19:22:31.358419     538 machine.go:331] failed to get cache information for node 0: open /sys/devices/system/cpu/cpu0/cache: no such file or directory
[....]

etc...

OK, Houston, we have a problem... TLS errors for slave1. Why? And what can I do about it?
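
In case it helps with the diagnosis, the certificate the master actually presents on port 6443 can be inspected with a generic openssl check (nothing k3s-specific; 192.168.1.9 is my master's IP from above):

echo | openssl s_client -connect 192.168.1.9:6443 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates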

Thanks in advance :-)

1 Answer:

Answer 0 (score: 0)

OK, starting from scratch again (fresh OS install everywhere) and using plain k3s instead of k3sup, I was able to get the nodes up and running within minutes:

pi@master:~ $ sudo cat /var/lib/rancher/k3s/server/node-token
xxx::server:yyy

pi@slave1:~ $ curl -sfL https://get.k3s.io | K3S_URL=https://master.home:6443 K3S_TOKEN=xxx::server:yyy sh -
pi@slave2:~ $ curl -sfL https://get.k3s.io | K3S_URL=https://master.home:6443 K3S_TOKEN=xxx::server:yyy sh -
pi@master:~ $ sudo kubectl get node -o wide
NAME     STATUS   ROLES    AGE     VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
master   Ready    master   57m     v1.18.9+k3s1   192.168.1.5   <none>        Raspbian GNU/Linux 10 (buster)   5.4.51-v7l+      containerd://1.3.3-k3s2
slave1   Ready    worker   9m15s   v1.18.9+k3s1   192.168.1.6   <none>        Raspbian GNU/Linux 10 (buster)   5.4.72-v7l+      containerd://1.3.3-k3s2
slave2   Ready    worker   9m43s   v1.18.9+k3s1   192.168.1.7   <none>        Raspbian GNU/Linux 10 (buster)   5.4.72-v7l+      containerd://1.3.3-k3s2
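
(The worker ROLES above come from relabeling the nodes, with the same kubectl commands as in the question:)

sudo kubectl label node slave1 node-role.kubernetes.io/worker=worker
sudo kubectl label node slave2 node-role.kubernetes.io/worker=worker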

Great success :-)

As a reminder, note that "cgroup_memory=1 cgroup_enable=memory" needs to be added to /boot/cmdline.txt on all nodes (I can't remember whether I did this correctly everywhere on the previous install; I once missed it on a single node and the symptoms were the same).
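
A minimal sketch of how that check could be scripted on each node (illustrative; note that /boot/cmdline.txt must remain a single line, which is why sed appends to line 1):

# append the cgroup flags if they are missing, then reboot to apply them
if ! grep -q cgroup_enable=memory /boot/cmdline.txt; then
    sudo sed -i '1 s/$/ cgroup_memory=1 cgroup_enable=memory/' /boot/cmdline.txt
    sudo reboot
fi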