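Since GCP Stackdriver supports alerting on Pub/Sub, I would like to create an alert along the lines of: "If more than XXX messages are waiting (unacked) in a subscription for a topic AND the consumer's consumption rate on that topic is close to 0, then trigger an alert."
I am used to Prometheus, where I can simply rely on labels to join time series, but I am not sure how to do the same with Stackdriver.
At first I considered using two conditions combined with "Policy violates when ALL conditions are met on matching resources", but I wonder whether "matching resources" behaves the same way as a label join in Prometheus.
Here is the alert policy I came up with, but it seems to trigger even when the two conditions are not both fully met: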
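To make the intent concrete, here is a minimal sketch in plain Python of the label join I have in mind. It is only an illustration of the logic, not Stackdriver or Prometheus code: the sample data, the function name `stalled_subscriptions`, and the backlog threshold of 100 (standing in for "XXX") are all hypothetical.

```python
# Hypothetical sample data: one value per (topic_id, subscription_id) label pair.
backlog = {  # unacked messages per subscription
    ("orders", "orders-worker"): 500,
    ("orders", "orders-audit"): 3,
    ("emails", "emails-sender"): 900,
}
ack_rate = {  # acknowledged messages per second
    ("orders", "orders-worker"): 0.0,
    ("orders", "orders-audit"): 0.0,
    ("emails", "emails-sender"): 25.0,
}

BACKLOG_THRESHOLD = 100  # assumption: stands in for "XXX"
ACK_RATE_THRESHOLD = 1   # "close to 0", same value as the ack-rate condition


def stalled_subscriptions(backlog, ack_rate):
    """Return the label pairs for which BOTH conditions hold on the same series."""
    firing = []
    # Join the two metrics on their shared labels, as a Prometheus `and` would.
    for labels in sorted(backlog.keys() & ack_rate.keys()):
        has_backlog = backlog[labels] > BACKLOG_THRESHOLD
        not_consuming = ack_rate[labels] < ACK_RATE_THRESHOLD
        if has_backlog and not_consuming:
            firing.append(labels)
    return firing


print(stalled_subscriptions(backlog, ack_rate))
```

With this sample data, only `("orders", "orders-worker")` should fire: `orders-audit` has no real backlog, and `emails-sender` has a backlog but is still being consumed. That per-subscription behaviour is what I expect from the alert.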
{
  "combiner": "AND_WITH_MATCHING_RESOURCE",
  "conditions": [
    {
      "conditionThreshold": {
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "crossSeriesReducer": "REDUCE_SUM",
            "groupByFields": [
              "metadata.system_labels.topic_id",
              "resource.label.subscription_id"
            ],
            "perSeriesAligner": "ALIGN_RATE"
          }
        ],
        "comparison": "COMPARISON_LT",
        "duration": "300s",
        "filter": "metric.type=\"pubsub.googleapis.com/subscription/ack_message_count\" resource.type=\"pubsub_subscription\" resource.label.\"project_id\"=\"pl-service-prod-lm-fr\"",
        "thresholdValue": 1,
        "trigger": {
          "count": 1
        }
      },
      "displayName": "Ack message count"
    },
    {
      "conditionThreshold": {
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "crossSeriesReducer": "REDUCE_SUM",
            "groupByFields": [
              "metadata.system_labels.topic_id",
              "resource.label.subscription_id"
            ],
            "perSeriesAligner": "ALIGN_MEAN"
          }
        ],
        "comparison": "COMPARISON_GT",
        "duration": "300s",
        "filter": "metric.type=\"pubsub.googleapis.com/subscription/num_undelivered_messages\" resource.type=\"pubsub_subscription\" resource.label.\"project_id\"=\"pl-service-prod-lm-fr\"",
        "trigger": {
          "count": 1
        }
      },
      "displayName": "Unacked messages"
    }
  ],
  "displayName": "Pub/Sub is not consumed",
  "enabled": true,
  "incidentStrategy": {}
}