I'm a Kubernetes newbie. I'm trying to follow along with some guides to get a small cluster up and running, but I've run into trouble...
I have a master and (4) nodes, all running Ubuntu 16.04
Installed docker on all nodes:
$ sudo apt-get update
$ sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
$ sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
$ sudo add-apt-repository \
"deb https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
$(lsb_release -cs) \
stable"
$ sudo apt-get update && apt-get install -y docker-ce=$(apt-cache madison docker-ce | grep 17.03 | head -1 | awk '{print $3}')
$ sudo docker version
Client:
Version: 17.12.1-ce
API version: 1.35
Go version: go1.9.4
Git commit: 7390fc6
Built: Tue Feb 27 22:17:40 2018
OS/Arch: linux/amd64
Server:
Engine:
Version: 17.12.1-ce
API version: 1.35 (minimum version 1.12)
Go version: go1.9.4
Git commit: 7390fc6
Built: Tue Feb 27 22:16:13 2018
OS/Arch: linux/amd64
Experimental: false
Turned off swap on all nodes:
$ sudo swapoff -a
Commented out the swap mount in /etc/fstab:
$ sudo vi /etc/fstab
$ mount -a
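The manual fstab edit above can also be scripted across all nodes; a minimal sketch (the `disable_swap_in_fstab` helper name is mine, not from any tool):

```shell
# Comment out any active swap entry in an fstab-style file so swap
# stays off across reboots. Run as root against /etc/fstab on each node.
disable_swap_in_fstab() {
  sed -i -E 's|^([^#].*[[:space:]]swap[[:space:]])|#\1|' "$1"
}
# e.g.: sudo bash -c 'disable_swap_in_fstab /etc/fstab' after sourcing this
```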
Installed kubeadm & kubectl on all nodes:
$ sudo curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ sudo cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ sudo apt-get update
$ sudo apt-get install -y kubeadm kubectl
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.4",
GitCommit:"bee2d1505c4fe820744d26d41ecd3fdd4a3d6546", GitTreeState:"clean",
BuildDate:"2018-03-12T16:21:35Z", GoVersion:"go1.9.3", Compiler:"gc",
Platform:"linux/amd64"}
Downloaded this and unzipped it into /usr/local/bin on the master and all nodes: https://github.com/kubernetes-incubator/cri-tools/releases
Installed etcd 3.3.0 on all nodes:
$ sudo groupadd --system etcd
$ sudo useradd --home-dir "/var/lib/etcd" \
--system \
--shell /bin/false \
-g etcd \
etcd
$ sudo mkdir -p /etc/etcd
$ sudo chown etcd:etcd /etc/etcd
$ sudo mkdir -p /var/lib/etcd
$ sudo chown etcd:etcd /var/lib/etcd
$ sudo rm -rf /tmp/etcd && mkdir -p /tmp/etcd
$ sudo curl -L https://github.com/coreos/etcd/releases/download/v3.3.0/etcd-v3.3.0-linux-amd64.tar.gz -o /tmp/etcd-3.3.0-linux-amd64.tar.gz
$ sudo tar xzvf /tmp/etcd-3.3.0-linux-amd64.tar.gz -C /tmp/etcd --strip-components=1
$ sudo cp /tmp/etcd/etcd /usr/bin/etcd
$ sudo cp /tmp/etcd/etcdctl /usr/bin/etcdctl
Took note of the master's IP:
$ sudo ifconfig -a eth0
eth0 Link encap:Ethernet HWaddr 1e:00:51:00:00:28
inet addr:172.20.43.30 Bcast:172.20.43.255 Mask:255.255.254.0
inet6 addr: fe80::27b5:3d06:94c9:9d0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3194023 errors:0 dropped:0 overruns:0 frame:0
TX packets:3306456 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:338523846 (338.5 MB) TX bytes:3682444019 (3.6 GB)
Initialized kubernetes on the master:
$ sudo kubeadm init --pod-network-cidr=172.20.43.0/16 \
--apiserver-advertise-address=172.20.43.30 \
--ignore-preflight-errors=cri \
--kubernetes-version stable-1.9
[init] Using Kubernetes version: v1.9.4
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING CRI]: unable to check if the container runtime at "/var/run/dockershim.sock" is running: exit status 1
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [jenkins-kube-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.20.43.30]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 37.502640 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node jenkins-kube-master as master by adding a label and a taint
[markmaster] Master jenkins-kube-master tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: 6be4b1.9a8dacf89f71e53c
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token 6be4b1.9a8dacf89f71e53c 172.20.43.30:6443 --discovery-token-ca-cert-hash sha256:524d29b032d7bfd319b147ab03a936bd429805258425bccca749de71bcb1efaf
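The join command depends on the token and CA cert hash printed above; if that output is lost, the hash can be recomputed from the cluster CA on the master using the standard recipe from the kubeadm docs (the `ca_cert_hash` wrapper name is mine):

```shell
# Recompute the --discovery-token-ca-cert-hash value from a CA cert.
# The default path is where kubeadm puts the CA on the master.
ca_cert_hash() {
  local cert="${1:-/etc/kubernetes/pki/ca.crt}"
  openssl x509 -pubkey -in "$cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | awk '{print "sha256:" $NF}'
}
```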
On the master node:
$ sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ export KUBECONFIG=$HOME/.kube/config
$ echo "export KUBECONFIG=$HOME/.kube/config" | tee -a ~/.bashrc
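One small nit on the `tee -a` line: re-running it appends a duplicate export every time. A guard like this keeps ~/.bashrc clean (the `append_once` helper name is just illustrative):

```shell
# Append a line to a file only if that exact line isn't already present.
append_once() {
  grep -qxF "$1" "$2" 2>/dev/null || echo "$1" >> "$2"
}
# e.g.: append_once 'export KUBECONFIG=$HOME/.kube/config' ~/.bashrc
```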
Set up flannel for the networking:
$ sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
$ sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
clusterrole "flannel" configured
clusterrolebinding "flannel" configured
Joined the nodes to the cluster by running this on each node:
$ sudo kubeadm join --token 6be4b1.9a8dacf89f71e53c 172.20.43.30:6443 \
--discovery-token-ca-cert-hash sha256:524d29b032d7bfd319b147ab03a936bd429805258425bccca749de71bcb1efaf \
--ignore-preflight-errors=cri
Installed the dashboard on the master:
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
Started the proxy:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
Opened another ssh session with -L 8001:127.0.0.1:8001 to forward the port and opened http://localhost:8001/ui in a local browser window.
It redirected to http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/ and said:
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "no endpoints available for service \"https:kubernetes-dashboard:\"",
"reason": "ServiceUnavailable",
"code": 503
}
Checked the pods...
$ sudo kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default guids-74487d79cf-zsj8q 1/1 Running 0 4h
kube-system etcd-jenkins-kube-master 1/1 Running 1 21h
kube-system kube-apiserver-jenkins-kube-master 1/1 Running 1 21h
kube-system kube-controller-manager-jenkins-kube-master 1/1 Running 2 21h
kube-system kube-dns-6f4fd4bdf-7pr9q 3/3 Running 0 1d
kube-system kube-flannel-ds-pvk8m 1/1 Running 0 4h
kube-system kube-flannel-ds-q4fsl 1/1 Running 0 4h
kube-system kube-flannel-ds-qhxn6 1/1 Running 0 21h
kube-system kube-flannel-ds-tkspz 1/1 Running 0 4h
kube-system kube-flannel-ds-vgqsb 1/1 Running 0 4h
kube-system kube-proxy-7np9b 1/1 Running 0 4h
kube-system kube-proxy-9lx8h 1/1 Running 1 1d
kube-system kube-proxy-f46d8 1/1 Running 0 4h
kube-system kube-proxy-fdtx9 1/1 Running 0 4h
kube-system kube-proxy-kmnjf 1/1 Running 0 4h
kube-system kube-scheduler-jenkins-kube-master 1/1 Running 1 21h
kube-system kubernetes-dashboard-5bd6f767c7-xf42n 0/1 CrashLoopBackOff 53 4h
Checked the logs...
$ sudo kubectl logs kubernetes-dashboard-5bd6f767c7-xf42n --namespace=kube-system
2018/03/20 17:56:25 Starting overwatch
2018/03/20 17:56:25 Using in-cluster config to connect to apiserver
2018/03/20 17:56:25 Using service account token for csrf signing
2018/03/20 17:56:25 No request provided. Skipping authorization
2018/03/20 17:56:55 Error while initializing connection to Kubernetes apiserver.
This most likely means that the cluster is misconfigured (e.g., it has invalid
apiserver certificates or service accounts configuration) or the
--apiserver-host param points to a server that does not exist.
Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
Refer to our FAQ and wiki pages for more information: https://github.com/kubernetes/dashboard/wiki/FAQ
I find the reference to 10.96.0.1 strange. I don't have that anywhere on my network that I know of.
I put the output of sudo kubectl describe pod --namespace=kube-system on pastebin:
https://pastebin.com/cPppPkRw
Thanks in advance for any pointers.
-Steve Maring
Orlando, FL
Answer (score 1):
--service-cluster-ip-range=10.96.0.0/12
Line 76 of your pastebin shows the Service CIDR, which matches kubernetes's view of the world: the .1 in the Service CIDR is always kubernetes (IIRC kube-dns gets a fairly low IP assignment too, but I can't recall whether it's always fixed the way kubernetes is).
Your danger is that you'll want to either change the Service and Pod CIDRs to accommodate the 10.244.0.0/16 subnet that flannel creates as a side effect of deploying that yaml, or change flannel's ConfigMap (awkward, now that the network has already been pushed into etcd) to be consistent with the Service and Pod CIDRs given to the apiserver.
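To make the CIDR relationships concrete: 10.96.0.1 is the .1 of the default 10.96.0.0/12 Service CIDR, while flannel's default 10.244.0.0/16 pod subnet falls outside the 172.20.43.0/16 --pod-network-cidr given to kubeadm. A quick pure-shell membership check (the helper names here are illustrative, not from any tool):

```shell
# Pure-shell check of whether an IPv4 address falls inside a CIDR block.
ip_to_int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}
in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}
in_cidr 10.96.0.1 10.96.0.0/12 && echo "10.96.0.1 is inside the Service CIDR"
in_cidr 10.244.0.1 172.20.43.0/16 || echo "flannel's default subnet is outside the pod CIDR given to kubeadm"
```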