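I am trying to set up an alert that fires when more than 99% of the requests received within 15 minutes are failed requests. I have written a KQL query, but unfortunately it triggers even when nothing is actually wrong, that is, when the failure rate is nowhere near 99%. The query is below; I am sure I am making some silly mistake in it.
Any help fixing this query would be appreciated, so that it only returns results in a genuinely critical situation (that is, when all incoming requests are failing).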
requests
| where cloud_RoleName == 'ABCDEF_cloudRName' and resultCode != '404'
| summarize FailedPercent=((countif(success == false))/count() by timestamp, cloud_RoleName, appName)*100
| where FailedPercent > 99
| project RelatedCI='XYZZZ',AlarmTime=timestamp,Category="Cloud-Azure-Monitor",SubCategory="Application",Object=appName ,"Value of Metric","Percentage Failed Requests"," is ", FailedPercent
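Answer 0 (score: 0)
Here is a similar question about sending an alert when the failure percentage is greater than xx%.
I have just written the query; feel free to modify it if it does not meet your needs: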
requests
| where resultCode != "404" and success == "False"
| summarize exceptionsCount =count()
| extend a = "a"
| join
(
requests
| where resultCode != "404"
| summarize requestsCount =count()
| extend a = "a"
)
on a
| project isFail = 1.0 * exceptionsCount / requestsCount > 0.99 //check if the failed percentage is greater than 99%.
| project rr=iff(isFail, "Fail","Pass" )
| where rr=="Fail"
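The `1.0 *` factor in the query above matters: KQL performs integer division when both operands are integer types, so a ratio of two counts is truncated unless one side is a real number. A small sketch you can paste into the query window to see the difference (it uses literal values, not real telemetry):

```kusto
// Both operands are long, so the result is truncated to a whole number.
// Multiplying by a real literal first forces real-valued division.
print int_div = 1 / 2, real_div = 1.0 * 1 / 2, percent = 100.0 * 1 / 2
```

This is probably why the query in the question fires unexpectedly: `countif(...) / count()` is an integer division, and grouping by the raw `timestamp` puts almost every request into its own group, so a single failed request can already evaluate to 100.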
Once the query is ready, you can follow the steps in the issue linked above to create a query-based alert.
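As an alternative to the self-join, the same check can be done in a single pass with `countif`. The following is only a sketch built on assumptions: it reuses the role name and output columns from the question, assumes `success` is a boolean column as the question's query implies (use `success == "False"` if it is stored as a string in your schema), assumes a 15-minute lookback via `ago(15m)`, and uses the latest request time as `AlarmTime`.

```kusto
requests
| where timestamp > ago(15m)   // assumed evaluation window; the alert rule period may already enforce this
| where cloud_RoleName == 'ABCDEF_cloudRName' and resultCode != '404'
// Aggregate per role/app instead of per timestamp, so the ratio covers the whole window.
| summarize Total = count(),
            Failed = countif(success == false),
            AlarmTime = max(timestamp)   // assumption: report the most recent request time
    by cloud_RoleName, appName
| extend FailedPercent = 100.0 * Failed / Total   // 100.0 forces real-valued division
| where FailedPercent > 99
| project RelatedCI = 'XYZZZ', AlarmTime, Category = "Cloud-Azure-Monitor",
          SubCategory = "Application", Object = appName, FailedPercent
```

With this shape the query returns a row only when more than 99% of the requests in the window failed, so the alert can be set to fire when the number of results is greater than zero.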