Why is GKE compute percentage so high?

Time: 2017-12-18 08:19:45

Tags: google-kubernetes-engine

I have a GKE cluster with 5 f1-micro nodes. It's running a very simple, 3-service, Node.js-based app that sees very little traffic.

I recently configured Stackdriver and noticed this weird graph:

[image]

Notice that all metrics are going up. I suspect this is a bug: the metrics are somehow being treated as cumulative when they should be gauges.
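To illustrate the cumulative-versus-gauge distinction, here is a minimal sketch of how a cumulative counter would be turned into a gauge-like rate. The function name and the sample data are hypothetical, and this is not how Stackdriver itself processes metrics; it only shows why a cumulative series plotted raw keeps climbing while the derived rate stays flat.

```python
def cumulative_to_rate(samples):
    """Convert (timestamp_seconds, cumulative_value) samples into per-second rates.

    Hypothetical helper for illustration only.
    """
    rates = []
    # Pair each sample with the next one and divide the delta by the elapsed time.
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((t1, (c1 - c0) / (t1 - t0)))
    return rates


# Made-up data: cumulative CPU-seconds sampled once a minute.
samples = [(0, 0.0), (60, 3.0), (120, 6.0), (180, 9.0)]
rates = cumulative_to_rate(samples)
```

The raw values rise from 0 to 9, but the derived rate is a constant 0.05 CPU-seconds per second (5% of one core) at every step.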

kube-ui doesn't show this outrageous CPU usage. I SSHed into the boxes and couldn't find any outstanding problems.

Moreover this graph, which should show the same thing, is completely different:

[image]

A couple of questions:

  1. first, has anyone else experienced this?
  2. why is this happening? Is there any way I can debug this?
  3. how can I fix it?

Thank you

Edit

The CPU usage has stabilised, but it's still at ridiculously high levels. It appears to be the bug JMD described below. Here's how the graph looks now for the last month:

[image]

1 answer:

Answer 0 (score: 0)

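There is a known issue with false alerts for high CPU usage. What you are experiencing should be related to it.

This seems to happen because data for short-lived instances is reported while they are starting up, but no values are reported once they go away.
That appears to create data that violates the alerting policy's threshold. Once the policy's duration has elapsed, the policy fires if all data within that duration is above the threshold.

The policy should close when the instance reports a value below the threshold, or after 7 days with no data reported.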
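The firing and closing rules described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Stackdriver's actual implementation: the function name, the 0.8 threshold, the 60-second duration, and the data points are all made up for illustration.

```python
SEVEN_DAYS = 7 * 24 * 3600  # seconds


def evaluate_policy(points, threshold, duration, check_times,
                    no_data_timeout=SEVEN_DAYS):
    """Return (check_time, is_open) for each check time.

    points: list of (timestamp_seconds, value), sorted by timestamp.
    Hypothetical model of the behaviour described in the answer.
    """
    is_open = False
    states = []
    for now in check_times:
        seen = [(t, v) for t, v in points if t <= now]
        window = [v for t, v in seen if now - duration < t]
        if not is_open:
            # Fire only if the window has data and every point exceeds the threshold.
            if window and all(v > threshold for v in window):
                is_open = True
        elif seen:
            last_t, last_v = seen[-1]
            # Close on a value at or below the threshold, or after prolonged silence.
            if last_v <= threshold or now - last_t >= no_data_timeout:
                is_open = False
        states.append((now, is_open))
    return states


# A short-lived instance: high CPU during startup, then it disappears.
points = [(0, 0.95), (30, 0.97), (60, 0.96)]
states = evaluate_policy(points, threshold=0.8, duration=60,
                         check_times=[60, 3600, 60 + SEVEN_DAYS])
```

In this model the incident opens at the first check, is still open an hour later even though the instance is gone, and closes only once 7 days have passed without data, which matches the false-alert pattern described.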