These are the cluster autoscaler logs:
I0922 17:08:33.857348 1 auto_scaling_groups.go:102] Updating ASG terraform-eks-demo20190922161659090500000007--terraform-eks-demo20190922161700651000000008
I0922 17:08:33.857380 1 aws_manager.go:152] Refreshed ASG list, next refresh after 2019-09-22 17:08:43.857375311 +0000 UTC m=+259.289807511
I0922 17:08:33.857465 1 utils.go:526] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0922 17:08:33.857482 1 static_autoscaler.go:261] Filtering out schedulables
I0922 17:08:33.857532 1 static_autoscaler.go:271] No schedulable pods
I0922 17:08:33.857545 1 static_autoscaler.go:279] No unschedulable pods
I0922 17:08:33.857557 1 static_autoscaler.go:333] Calculating unneeded nodes
I0922 17:08:33.857601 1 scale_down.go:376] Scale-down calculation: ignoring 2 nodes unremovable in the last 5m0s
I0922 17:08:33.857621 1 scale_down.go:407] Node ip-10-0-1-135.us-west-2.compute.internal - utilization 0.055000
I0922 17:08:33.857688 1 static_autoscaler.go:349] ip-10-0-1-135.us-west-2.compute.internal is unneeded since 2019-09-22 17:05:07.299351571 +0000 UTC m=+42.731783882 duration 3m26.405144434s
I0922 17:08:33.857703 1 static_autoscaler.go:360] Scale down status: unneededOnly=true lastScaleUpTime=2019-09-22 17:04:42.29864432 +0000 UTC m=+17.731076395 lastScaleDownDeleteTime=2019-09-22 17:04:42.298645611 +0000 UTC m=+17.731077680 lastScaleDownFailTime=2019-09-22 17:04:42.298646962 +0000 UTC m=+17.731079033 scaleDownForbidden=false isDeleteInProgress=false
I0922 17:08:33.857688 1 static_autoscaler.go:349] ip-10-0-1-135.us-west-2.compute.internal is unneeded since 2019-09-22 17:05:07.299351571 +0000 UTC m=+42.731783882 duration 3m26.405144434s
If the node is unneeded, what is the next step? What is it waiting for?
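To make the `Scale down status` line easier to read, it can be split into its fields. The following is only a minimal Python sketch, assuming the key=value layout of the line above; `parse_status` and `parse_go_time` are hypothetical helper names and not part of cluster-autoscaler, and the year of the log entry is assumed to be 2019 because the log prefix does not carry one.

```python
import re
from datetime import datetime

# The status line copied from the log above.
LINE = (
    "I0922 17:08:33.857703 1 static_autoscaler.go:360] Scale down status: "
    "unneededOnly=true "
    "lastScaleUpTime=2019-09-22 17:04:42.29864432 +0000 UTC m=+17.731076395 "
    "lastScaleDownDeleteTime=2019-09-22 17:04:42.298645611 +0000 UTC m=+17.731077680 "
    "lastScaleDownFailTime=2019-09-22 17:04:42.298646962 +0000 UTC m=+17.731079033 "
    "scaleDownForbidden=false isDeleteInProgress=false"
)

# Keys need at least two letters, so the Go monotonic-clock suffix
# "m=+17.7..." stays inside the timestamp value instead of becoming a key.
FIELD_RE = re.compile(r"([A-Za-z]{2,})=(.*?)(?=\s[A-Za-z]{2,}=|$)")


def parse_status(line):
    """Return the key=value fields that follow 'Scale down status: '."""
    _, _, payload = line.partition("Scale down status: ")
    return dict(FIELD_RE.findall(payload.strip()))


def parse_go_time(value):
    """Parse the date and time part of a Go time.Time string (assumed UTC)."""
    date, clock = value.split()[:2]
    hms, _, frac = clock.partition(".")
    frac = (frac or "0")[:6]  # strptime's %f accepts at most six digits
    return datetime.strptime(f"{date} {hms}.{frac}", "%Y-%m-%d %H:%M:%S.%f")


fields = parse_status(LINE)
# Timestamp from the log prefix "I0922 17:08:33.857703"; year assumed.
logged_at = datetime(2019, 9, 22, 17, 8, 33, 857703)
since_scale_up = (logged_at - parse_go_time(fields["lastScaleUpTime"])).total_seconds()
print(fields["unneededOnly"], round(since_scale_up))
```

For this line, `unneededOnly` is `true` and `lastScaleUpTime` is roughly 232 seconds before the log entry. The autoscaler also documents a `--scale-down-delay-after-add` flag (default 10m) that is not set in the arguments below, so that window may be worth comparing against these timestamps.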
I drained one node:
kubectl get nodes -o=wide
NAME                                       STATUS                     ROLES    AGE   VERSION               INTERNAL-IP   EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-10-0-0-118.us-west-2.compute.internal   Ready                      <none>   46m   v1.13.10-eks-d6460e   10.0.0.118    52.40.115.132    Amazon Linux 2   4.14.138-114.102.amzn2.x86_64   docker://18.6.1
ip-10-0-0-211.us-west-2.compute.internal   Ready                      <none>   44m   v1.13.10-eks-d6460e   10.0.0.211    35.166.57.203    Amazon Linux 2   4.14.138-114.102.amzn2.x86_64   docker://18.6.1
ip-10-0-1-135.us-west-2.compute.internal   Ready,SchedulingDisabled   <none>   46m   v1.13.10-eks-d6460e   10.0.1.135    18.237.253.134   Amazon Linux 2   4.14.138-114.102.amzn2.x86_64   docker://18.6.1
Why is the instance not being terminated?
These are the parameters I am using:
- ./cluster-autoscaler
- --cloud-provider=aws
- --namespace=default
- --scan-interval=25s
- --scale-down-unneeded-time=30s
- --nodes=1:20:terraform-eks-demo20190922161659090500000007--terraform-eks-demo20190922161700651000000008
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/example-job-runner
- --logtostderr=true
- --stderrthreshold=info
- --v=4
Answer 0 (score: 0)
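Do you have any of the following?
Your configuration and start-up options for CA look good to me.
I can only imagine it has something to do with a specific pod running on that node. Maybe go through the kube-system pods running on the listed nodes that are not scaling down, and check them against the list above.
These two page sections have some good items to check that could be causing CA not to scale down nodes.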
Low utilization nodes but not scaling down, why?
What types of pods can prevent CA from removing a node?
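A rough way to go through that checklist is to run each pod on the node through a small filter. This is only a Python sketch of the checks as I understand them from the CA FAQ (pods with no controller, kube-system pods without a PodDisruptionBudget, pods with local storage, and the `cluster-autoscaler.kubernetes.io/safe-to-evict` annotation). `scale_down_blockers` and the sample pods are hypothetical, the `has_pdb` argument is an assumption you have to supply yourself, and the exact rules differ between CA versions.

```python
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def scale_down_blockers(pod, has_pdb=False):
    """Return reasons why this pod might keep CA from removing its node.

    `pod` is a dict shaped like one item of `kubectl get pods -o json`.
    `has_pdb` says whether a PodDisruptionBudget covers the pod (assumed input).
    """
    meta = pod.get("metadata", {})
    spec = pod.get("spec", {})
    annotations = meta.get("annotations") or {}
    owners = meta.get("ownerReferences") or []

    # An explicit annotation overrides the other checks in either direction.
    if annotations.get(SAFE_TO_EVICT) == "true":
        return []
    if annotations.get(SAFE_TO_EVICT) == "false":
        return ["annotated safe-to-evict=false"]

    # DaemonSet pods are skipped here, on the assumption that CA ignores them.
    if any(owner.get("kind") == "DaemonSet" for owner in owners):
        return []

    reasons = []
    if not owners:
        reasons.append("not backed by a controller")
    if meta.get("namespace") == "kube-system" and not has_pdb:
        reasons.append("kube-system pod without a PodDisruptionBudget")
    volumes = spec.get("volumes") or []
    if any("emptyDir" in v or "hostPath" in v for v in volumes):
        reasons.append("uses local storage")
    return reasons


# Hypothetical pods for illustration only, not taken from the cluster above.
dns_pod = {
    "metadata": {"name": "coredns-example", "namespace": "kube-system",
                 "ownerReferences": [{"kind": "ReplicaSet"}]},
    "spec": {},
}
bare_pod = {
    "metadata": {"name": "debug-shell", "namespace": "default"},
    "spec": {"volumes": [{"name": "scratch", "emptyDir": {}}]},
}

for pod in (dns_pod, bare_pod):
    print(pod["metadata"]["name"], scale_down_blockers(pod))
```

The input could come from `kubectl get pods --all-namespaces -o json --field-selector spec.nodeName=ip-10-0-1-135.us-west-2.compute.internal`, iterating over the `items` array of the result.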