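I'm having a problem where some (but not all) of the HPAs in my cluster stop updating their CPU utilization. This seems to happen after a different HPA scales its target deployment.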
Running kubectl describe hpa on an affected HPA yields the following event:
56m <invalid> 453 {horizontal-pod-autoscaler } Warning FailedUpdateStatus Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "sync-api": the object has been modified; please apply your changes to the latest version and try again
The controller-manager logs show that the affected HPAs start having problems immediately after a scaling event on another HPA:
I0920 03:50:33.807951 1 horizontal.go:403] Successfully updated status for sync-api
I0920 03:50:33.821044 1 horizontal.go:403] Successfully updated status for monolith
I0920 03:50:34.982382 1 horizontal.go:403] Successfully updated status for aurora
I0920 03:50:35.002736 1 horizontal.go:403] Successfully updated status for greyhound-api
I0920 03:50:35.014838 1 horizontal.go:403] Successfully updated status for sync-api
I0920 03:50:35.035785 1 horizontal.go:403] Successfully updated status for monolith
I0920 03:50:48.873503 1 horizontal.go:403] Successfully updated status for aurora
I0920 03:50:48.949083 1 horizontal.go:403] Successfully updated status for greyhound-api
I0920 03:50:49.005793 1 horizontal.go:403] Successfully updated status for sync-api
I0920 03:50:49.103726 1 horizontal.go:346] Successfull rescale of monolith, old size: 7, new size: 6, reason: All metrics below target
I0920 03:50:49.135993 1 horizontal.go:403] Successfully updated status for monolith
I0920 03:50:49.137008 1 event.go:216] Event(api.ObjectReference{Kind:"Deployment", Namespace:"default", Name:"monolith", UID:"086bfbee-7ec7-11e6-a6f5-0240c833a143", APIVersion:"extensions", ResourceVersion:"4210077", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set monolith-1803096525 to 6
E0920 03:50:49.169382 1 deployment_controller.go:400] Error syncing deployment default/monolith: Deployment.extensions "monolith" is invalid: status.unavailableReplicas: Invalid value: -1: must be greater than or equal to 0
I0920 03:50:49.172986 1 replica_set.go:463] Too many "default"/"monolith-1803096525" replicas, need 6, deleting 1
E0920 03:50:49.222184 1 deployment_controller.go:400] Error syncing deployment default/monolith: Deployment.extensions "monolith" is invalid: status.unavailableReplicas: Invalid value: -1: must be greater than or equal to 0
I0920 03:50:50.573273 1 event.go:216] Event(api.ObjectReference{Kind:"ReplicaSet", Namespace:"default", Name:"monolith-1803096525", UID:"086e56d0-7ec7-11e6-a6f5-0240c833a143", APIVersion:"extensions", ResourceVersion:"4210080", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: monolith-1803096525-gaz5x
E0920 03:50:50.634225 1 deployment_controller.go:400] Error syncing deployment default/monolith: Deployment.extensions "monolith" is invalid: status.unavailableReplicas: Invalid value: -1: must be greater than or equal to 0
I0920 03:50:50.666270 1 horizontal.go:403] Successfully updated status for aurora
I0920 03:50:50.955971 1 horizontal.go:403] Successfully updated status for greyhound-api
W0920 03:50:50.980039 1 horizontal.go:99] Failed to reconcile greyhound-api: failed to update status for greyhound-api: Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "greyhound-api": the object has been modified; please apply your changes to the latest version and try again
I0920 03:50:50.995372 1 horizontal.go:403] Successfully updated status for sync-api
W0920 03:50:51.017321 1 horizontal.go:99] Failed to reconcile sync-api: failed to update status for sync-api: Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "sync-api": the object has been modified; please apply your changes to the latest version and try again
I0920 03:50:51.032596 1 horizontal.go:403] Successfully updated status for aurora
W0920 03:50:51.084486 1 horizontal.go:99] Failed to reconcile monolith: failed to update status for monolith: Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "monolith": the object has been modified; please apply your changes to the latest version and try again
Manually updating the affected HPAs with kubectl edit fixes the problem, but this makes me worry about how reliable HPAs are for autoscaling.
Any help is appreciated. I'm running v1.3.6.
Answer 0 (score: 1)
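It is not correct to set up multiple HPAs that point to the same target deployment. When two different HPAs point to the same target (as described here), the system may behave strangely.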
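To check whether a cluster actually has several HPAs aimed at one target, something like the following might help. It is only a minimal sketch: it assumes the HPA objects have the autoscaling/v1 shape, where the target is stored under spec.scaleTargetRef (older API groups may name this field differently), and the function name find_shared_targets and the sample HPA names hpa-a, hpa-b and hpa-c are made up for illustration.

```python
from collections import defaultdict


def find_shared_targets(hpa_list):
    """Return scale targets that are referenced by more than one HPA.

    hpa_list is the parsed JSON of an HPA list (a dict with an "items" key).
    Keys of the result are (namespace, target kind, target name) tuples and
    values are the sorted names of the HPAs pointing at that target.
    """
    by_target = defaultdict(list)
    for item in hpa_list.get("items", []):
        meta = item.get("metadata", {})
        # Assumption: autoscaling/v1 layout, target under spec.scaleTargetRef.
        ref = item.get("spec", {}).get("scaleTargetRef", {})
        key = (meta.get("namespace", "default"),
               ref.get("kind", ""),
               ref.get("name", ""))
        by_target[key].append(meta.get("name", ""))
    # Keep only targets that more than one HPA points at.
    return {k: sorted(v) for k, v in by_target.items() if len(v) > 1}


# Hypothetical sample: two HPAs share the "monolith" deployment.
sample = {"items": [
    {"metadata": {"namespace": "default", "name": "hpa-a"},
     "spec": {"scaleTargetRef": {"kind": "Deployment", "name": "monolith"}}},
    {"metadata": {"namespace": "default", "name": "hpa-b"},
     "spec": {"scaleTargetRef": {"kind": "Deployment", "name": "monolith"}}},
    {"metadata": {"namespace": "default", "name": "hpa-c"},
     "spec": {"scaleTargetRef": {"kind": "Deployment", "name": "aurora"}}},
]}

for (namespace, kind, name), hpas in sorted(find_shared_targets(sample).items()):
    print(f"{namespace}: {kind}/{name} is targeted by {', '.join(hpas)}")
```

Against a real cluster, the input would be the result of json.loads applied to the output of kubectl get hpa --all-namespaces -o json instead of the sample dict. An empty result means no two HPAs share a target, in which case the conflicts would need another explanation.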