Kubernetes 1.4.3 spin-up script keeps looping forever on AWS

Date: 2016-10-19 11:13:08

Tags: amazon-web-services amazon-ec2 kubernetes kubectl

When I run cluster/kube-up.sh, it loops while waiting for the cluster to initialize. I have tried several times to start a cluster in us-west-1, us-west-2, and eu-west-1, without success.

Output of the startup script:

$ export KUBE_AWS_ZONE=eu-west-1a
$ export NUM_NODES=3
$ export KUBE_AWS_INSTANCE_PREFIX=test
$ export MASTER_SIZE=m3.medium
$ export NODE_SIZE=t2.medium
$ export KUBERNETES_PROVIDER=aws
$ ./cluster/kube-up.sh
... Starting cluster in eu-west-1a using provider aws
... calling verify-prereqs
... calling kube-up
Starting cluster using os distro: jessie
Uploading to Amazon S3
Creating kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586
make_bucket: s3://kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/
Confirming bucket was created...
+++ Staging server tars to S3 Storage: kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel
upload: ../../tmp/kubernetes.Bj5OaA/s3/bootstrap-script to s3://kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel/bootstrap-script
upload: ../../tmp/kubernetes.Bj5OaA/s3/kubernetes-salt.tar.gz to s3://kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel/kubernetes-salt.tar.gz
upload: ../../tmp/kubernetes.Bj5OaA/s3/kubernetes-server-linux-amd64.tar.gz to s3://kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel/kubernetes-server-linux-amd64.tar.gz

Uploaded server tars:
  SERVER_BINARY_TAR_URL: https://s3.amazonaws.com/kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel/kubernetes-server-linux-amd64.tar.gz
  SALT_TAR_URL: https://s3.amazonaws.com/kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel/kubernetes-salt.tar.gz
  BOOTSTRAP_SCRIPT_URL: https://s3.amazonaws.com/kubernetes-staging-17d502113db4ff6c4fb9c4b42955c586/devel/bootstrap-script
INSTANCEPROFILE arn:aws:iam::333659885792:instance-profile/kubernetes-master    2016-07-28T10:52:30Z    AIPAJ2ESPVLTF7USVISQI   kubernetes-master   /
ROLES   arn:aws:iam::333659885792:role/kubernetes-master    2016-07-28T10:52:29Z    /   AROAJGF5WAAV5OFWYKBHW   kubernetes-master
ASSUMEROLEPOLICYDOCUMENT    2012-10-17
STATEMENT   sts:AssumeRole  Allow
PRINCIPAL   ec2.amazonaws.com
INSTANCEPROFILE arn:aws:iam::333659885792:instance-profile/kubernetes-minion    2016-08-04T08:41:10Z    AIPAJYJLZTINGNI4RFBLY   kubernetes-minion   /
ROLES   arn:aws:iam::333659885792:role/kubernetes-minion    2016-08-04T08:41:10Z    /   AROAIWUBQVYHYHTSSEH6C   kubernetes-minion
ASSUMEROLEPOLICYDOCUMENT    2012-10-17
STATEMENT   sts:AssumeRole  Allow
PRINCIPAL   ec2.amazonaws.com
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/kube_aws_rsa.
Your public key has been saved in /root/.ssh/kube_aws_rsa.pub.
The key fingerprint is:
25:a6:8e:4f:79:2f:75:bf:55:6e:68:c6:7a:35:28:a9 root@ip-172-31-12-179
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|        o .      |
|       o o       |
|      . S   . . .|
|     o .  .o.o +o|
|    . + ......=.=|
|     o ..E   +oo |
|      .  .. .... |
+-----------------+
Using SSH key with (AWS) fingerprint: 25:a6:8e:4f:79:2f:75:bf:55:6e:68:c6:7a:35:28:a9
Using VPC vpc-55137031
Adding tag to dopt-c7f311a3: Name=kubernetes-dhcp-option-set
Adding tag to dopt-c7f311a3: KubernetesCluster=test
Using DHCP option set dopt-c7f311a3
Using existing subnet with CIDR 172.20.0.0/24
Using subnet subnet-0a43046e
Creating Internet Gateway.
Using Internet Gateway igw-9dc42bf9
Associating route table.
Creating route table
Adding tag to rtb-da9cb4be: KubernetesCluster=test
Associating route table rtb-da9cb4be to subnet subnet-0a43046e
Adding route to route table rtb-da9cb4be
Using Route Table rtb-da9cb4be
Creating master security group.
Creating security group kubernetes-master-test.
Adding tag to sg-80b072e6: KubernetesCluster=test
Creating minion security group.
Creating security group kubernetes-minion-test.
Adding tag to sg-8cb072ea: KubernetesCluster=test
Using master security group: kubernetes-master-test sg-80b072e6
Using minion security group: kubernetes-minion-test sg-8cb072ea
Creating master disk: size 20GB, type gp2
Adding tag to vol-1f19bb9d: Name=test-master-pd
Adding tag to vol-1f19bb9d: KubernetesCluster=test
Allocated Elastic IP for master: 52.49.10.199
Adding tag to vol-1f19bb9d: kubernetes.io/master-ip=52.49.10.199
Generating certs for alternate-names: IP:52.49.10.199,IP:172.20.0.9,IP:10.0.0.1,DNS:kubernetes,DNS:kubernetes.default,DNS:kubernetes.default.svc,DNS:kubernetes.default.svc.cluster.local,DNS:test-master
Starting Master
Adding tag to i-ac1ca727: Name=test-master
Adding tag to i-ac1ca727: Role=test-master
Adding tag to i-ac1ca727: KubernetesCluster=test
Waiting for master to be ready
Attempt 1 to check for master nodeWaiting for instance i-ac1ca727 to be running (currently pending)
Sleeping for 3 seconds...
Waiting for instance i-ac1ca727 to be running (currently pending)
Sleeping for 3 seconds...
Waiting for instance i-ac1ca727 to be running (currently pending)
Sleeping for 3 seconds...
 [master running]
Attaching IP 52.49.10.199 to instance i-ac1ca727
Attaching persistent data volume (vol-1f19bb9d) to master
2016-10-19T09:41:18.422Z    /dev/sdb    i-ac1ca727  attaching   vol-1f19bb9d
cluster "aws_test" set.
user "aws_test" set.
context "aws_test" set.
switched to context "aws_test".
user "aws_test-basic-auth" set.
Wrote config for aws_test to /root/.kube/config
Creating minion configuration
Creating autoscaling group
 0 minions started; waiting
 0 minions started; waiting
 0 minions started; waiting
 0 minions started; waiting
 3 minions started; ready
Waiting for cluster initialization.

  This will continually check to see if the API for kubernetes is reachable.
  This might loop forever if there was some uncaught error during start
  up.

..............................................................................................................................................................................................^C


$ ./cluster/kubectl.sh version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.3", GitCommit:"4957b090e9a4f6a68b4a40375408fdc74a212260", GitTreeState:"clean", BuildDate:"2016-10-16T06:36:33Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
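
As the script's own message says, the wait step can loop forever. A minimal sketch of a bounded alternative is shown here, so that a failed bring-up ends with an error instead of hanging. The function name `wait_for_api` and its arguments are hypothetical and not part of kube-up.sh; the example probe assumes the API server answers on `/healthz` at the master's Elastic IP.

```shell
#!/usr/bin/env bash
# Sketch: retry a probe command a bounded number of times instead of forever.
# wait_for_api is a hypothetical helper, not something kube-up.sh provides.
wait_for_api() {
  local max_attempts="$1" delay="$2"
  shift 2                      # the remaining arguments are the probe command
  local attempt
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    if "$@"; then
      echo "probe succeeded on attempt ${attempt}"
      return 0
    fi
    sleep "${delay}"
  done
  echo "probe still failing after ${max_attempts} attempts" >&2
  return 1
}

# Example probe (assumption: /healthz is reachable on the master's Elastic IP):
# wait_for_api 60 5 curl --insecure --silent --fail --max-time 5 \
#   --output /dev/null https://52.49.10.199/healthz
```

If the probe never succeeds, the non-zero exit status is the cue to log in to the master and inspect its bootstrap logs.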

Some information from the master node:

$ ps -ef | grep kube
root       620   618  0 10:00 pts/0    00:00:00 grep kube
$ cat /var/log/kube-apiserver.log
cat: /var/log/kube-apiserver.log: No such file or directory
$ cat /var/log/cloud-init.log | grep -i error
$ 

1 Answer:

Answer 0: (score: 0)

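I had a similar problem, and it was caused by a bad DNS server IP assignment. In my case, SaltStack (the framework used to provision the master) failed to resolve the local hostname while the master was spinning up.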
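
One quick way to test for this kind of failure is to ask the system resolver whether the master can resolve its own hostname. The following is only a sketch: `check_resolves` is a hypothetical helper name, and it assumes `getent` is available on the master image.

```shell
#!/usr/bin/env bash
# Sketch: report whether a name resolves through the system resolver (NSS).
# check_resolves is a hypothetical helper name.
check_resolves() {
  local name="$1"
  if getent hosts "${name}" >/dev/null; then
    echo "${name}: resolves"
  else
    echo "${name}: does NOT resolve"
    return 1
  fi
}

# On the master, the interesting name is the machine's own hostname:
# check_resolves "$(hostname)"
```

A failure here would be consistent with the DNS problem described above, although it does not prove that Salt failed for the same reason.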

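Your startup parameters look fairly noisy, so I have no specific ideas about your case. Still, checking the same logs for similar error messages may help; at the very least you could then file an issue so the k8s team can fix it faster.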

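I suggest double-checking /var/log/syslog for something like the lines below. If you find signs of a SaltStack provisioning problem, search backwards from that point in time for anything that looks abnormal.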

Sep 27 13:03:36 ip-172-40-0-9 rc.local[374]: -------------
Sep 27 13:03:36 ip-172-40-0-9 rc.local[374]: Succeeded: 89
Sep 27 13:03:36 ip-172-40-0-9 rc.local[374]: Failed:     6
Sep 27 13:03:36 ip-172-40-0-9 rc.local[374]: -------------
Sep 27 13:03:36 ip-172-40-0-9 rc.local[374]: Total:     95
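
Scanning a long syslog for that summary by eye is tedious, so a small filter can help. This sketch prints only the lines where a `Failed:` count is greater than zero. `salt_failures` is a hypothetical helper name, and it assumes the summary is logged in the same shape as the sample above.

```shell
#!/usr/bin/env bash
# Sketch: print syslog lines whose "Failed:" count is non-zero.
# salt_failures is a hypothetical helper; it reads syslog text on stdin
# and exits 0 only if at least one such line was found.
salt_failures() {
  awk '{
    for (i = 1; i < NF; i++) {
      # The count is the field that follows the literal "Failed:" token.
      if ($i == "Failed:" && $(i + 1) + 0 > 0) { print; found = 1 }
    }
  } END { exit (found ? 0 : 1) }'
}

# Usage on the master (assumption: the summary lands in /var/log/syslog):
# salt_failures < /var/log/syslog
```

The timestamp on any matching line gives the point from which to search backwards for the states that failed.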

Here is the full, detailed issue I filed: https://github.com/kubernetes/kubernetes/issues/33559. Some of the further details there may help (though it is unlikely).