I'm using a Kusto query to create a timechart in Azure AppInsights that visualizes when our web service is within its SLO (and when it isn't), using one of Google's examples of measuring if a webservice is within its error budget:
SLI = The proportion of sufficiently fast requests, as measured from the load balancer metrics. “Sufficiently fast” is defined as < 400 ms.
SLO = 90% of requests < 400 ms
Measured as:
count of http_requests with a duration less than or equal to "0.4" seconds
divided by count of all http_requests
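As a quick illustration of the formula above, here is a minimal sketch that plugs in the counts from the first row of the sample output shown later in this question (8,098 fast requests out of 8,138). The column names `FastCount`, `SliPercent`, and `MeetsSlo` are made up for this example.

```kusto
// Worked example of the SLI formula: fast requests divided by all requests.
// The two counts are taken from the first row of the sample output in this question.
print FastCount = 8098, TotalCount = 8138
| extend SliPercent = round(100.0 * FastCount / TotalCount, 2)  // 99.51
| extend MeetsSlo = SliPercent >= 90.0                          // compared against the 90% SLO
```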
Assuming 10-minute inspection intervals over a 7-day period, here is my code:
let fastResponseTimeMaxMs = 400.0;
let errorBudgetThresholdForFastResponseTime = 90.0;
//
let startTime = ago(7days);
let endTime = now();
let timeStep = 10m;
//
let timeRange = range InspectionTime from startTime to endTime step timeStep;
timeRange
| extend RespTimeMax_ms = fastResponseTimeMaxMs
| extend ActualCount = toscalar
(
    requests
    | where timestamp > InspectionTime - timeStep
    | where timestamp <= InspectionTime
    | where success == "True"
    | where duration <= fastResponseTimeMaxMs
    | count
)
| extend TotalCount = toscalar
(
    requests
    | where timestamp > InspectionTime - timeStep
    | where timestamp <= InspectionTime
    | where success == "True"
    | count
)
| extend Percentage = round(todecimal(ActualCount * 100) / todecimal(TotalCount), 2)
| extend ErrorBudgetMinPercent = errorBudgetThresholdForFastResponseTime
| extend InBudget = case(Percentage >= ErrorBudgetMinPercent, 1, 0)
Example of the query output I'm hoping to achieve:
InspectionTime [UTC]      RespTimeMax_ms  ActualCount  TotalCount  Percentage  ErrorBudgetMinPercent  InBudget
2019-05-23T21:53:17.894   400             8,098        8,138       99.51       90                     1
2019-05-23T22:03:17.894   400             8,197        9,184       89.14       90                     0
2019-05-23T22:13:17.894   400             8,002        8,555       93.54       90                     1
The error I get is:
'where' operator: Failed to resolve scalar expression named 'InspectionTime'
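I tried todatetime(InspectionTime), which fails with the same error.
Replacing InspectionTime with other objects of type datetime makes this code execute OK, but not with the datetime values I need. For example, using this snippet in the code sample above executes OK: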
| extend ActualCount = toscalar
(
    requests
    | where timestamp > startTime // instead of 'InspectionTime - timeStep'
    | where timestamp <= endTime // instead of 'InspectionTime'
    | where duration <= fastResponseTimeMaxMs
    | count
)
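It seems to me that using InspectionTime inside toscalar(...) is the crux of this problem, since I can use InspectionTime in similar queries built on range(...) as long as it isn't nested inside toscalar(...).
Note: I don't want a timechart of request.duration, since that doesn't tell me whether the count of requests above my threshold (400 ms) exceeds our error budget according to the formula defined above.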
Answer 0 (score: 0)
Your query is invalid because you can't reference the InspectionTime column from within the subquery that runs inside toscalar().
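To illustrate the restriction, here is a minimal sketch of a usage that toscalar() does support: the subquery references no column of the outer table, so it can be evaluated once and treated as a constant. The `requestsSample` table and its rows are hypothetical stand-ins for the real `requests` table.

```kusto
// Hypothetical sample data standing in for the real `requests` table.
let requestsSample = datatable(timestamp: datetime, duration: real)
[
    datetime(2019-05-23 21:51:00), 120.0,
    datetime(2019-05-23 21:55:00), 950.0,
    datetime(2019-05-23 22:02:00), 200.0
];
// The subquery references no column from the outer query,
// so toscalar() can reduce it to a single constant value.
let totalCount = toscalar(requestsSample | count);
requestsSample
| extend TotalCount = totalCount  // the same constant is attached to every row
```

A per-row value such as InspectionTime does not fit this pattern, which is why a per-window count usually has to come from an aggregation like summarize rather than from toscalar().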
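If I understand the desired logic correctly, the following query might work, or at least point you in a different direction (if it doesn't, you may want to share a sample input data set using the datatable operator, and specify the desired result that matches it):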
let fastResponseTimeMaxMs = 400.0;
let errorBudgetThresholdForFastResponseTime = 90.0;
//
let startTime = ago(7days);
let endTime = now();
let timeStep = 10m;
//
requests
| where timestamp > startTime and timestamp < endTime
| where success == 'True'
| summarize TotalCount = count(), ActualCount = countif(duration <= fastResponseTimeMaxMs) by bin(timestamp, timeStep)
| extend Percentage = round(todecimal(ActualCount * 100) / todecimal(TotalCount), 2)
| extend ErrorBudgetMinPercent = errorBudgetThresholdForFastResponseTime
| extend InBudget = case(Percentage >= ErrorBudgetMinPercent, 1, 0)
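To try this approach without access to real telemetry, the query above can be run against a small in-memory table. This is only a sketch: `requestsSample` and its five rows are invented for illustration, the percentage is computed as a real rather than a decimal for simplicity, and iff() replaces case() for the two-way branch.

```kusto
// Hypothetical sample data standing in for the real `requests` table.
let requestsSample = datatable(timestamp: datetime, success: string, duration: real)
[
    datetime(2019-05-23 21:51:00), "True", 120.0,
    datetime(2019-05-23 21:55:00), "True", 380.0,
    datetime(2019-05-23 21:58:00), "True", 950.0,
    datetime(2019-05-23 22:02:00), "True", 200.0,
    datetime(2019-05-23 22:07:00), "True", 390.0
];
let fastResponseTimeMaxMs = 400.0;
let errorBudgetThresholdForFastResponseTime = 90.0;
let timeStep = 10m;
requestsSample
| where success == "True"
// One row per fixed 10-minute bin, with both counts computed in a single pass.
| summarize TotalCount = count(), ActualCount = countif(duration <= fastResponseTimeMaxMs) by bin(timestamp, timeStep)
| extend Percentage = round(100.0 * ActualCount / TotalCount, 2)
| extend InBudget = iff(Percentage >= errorBudgetThresholdForFastResponseTime, 1, 0)
| order by timestamp asc
```

With this sample data, the 21:50 bin holds three requests of which two are fast (66.67%, out of budget), and the 22:00 bin holds two fast requests out of two (100%, in budget). Two differences from the query in the question are worth keeping in mind: bin() produces fixed windows aligned to multiples of timeStep rather than windows ending at each InspectionTime, and summarize emits no row for a bin that contains no requests.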