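Or am I doing something wrong? I have a service that counts the requests it receives. Each request carries a platform, the version of the client application that made the request, and other tags. When the service is restarted (which happens rarely, on updates), the metrics are reset.
So I want to calculate the percentage of requests per platform over a recent time range, and I do the following: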
SELECT SUM("received") as "received"
FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 4h GROUP BY time(1s)
) GROUP BY "platform";
Which returns:
...
tags: platform=ios
time received
---- --------
1970-01-01T00:00:00Z 581
tags: platform=unknown
time received
---- --------
1970-01-01T00:00:00Z 12310
tags: platform=web
time received
---- --------
1970-01-01T00:00:00Z 6196
And the same without grouping:
SELECT SUM("received") as "received"
FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 4h GROUP BY time(1s)
);
Which returns:
time received
---- --------
1970-01-01T00:00:00Z 8274
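This is obviously incorrect, because the "unknown" platform alone cannot have received more requests than all platforms together. But I can't even tell which result is wrong: the total, the per-platform breakdown, or whether neither is right.
How do I correctly compute the total number of requests and the per-platform totals?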
Answer 0 (score: 0)
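OK, so the problem is that my measurement also has other tags, such as server and application version, and each combination of tag values has its own counter, so the counters all get interleaved. You can see this on a graph, which should be smooth and cumulative but is very spiky.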
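To make the interleaving concrete, here is a minimal Python sketch (not InfluxDB code) that roughly mimics the MAX per second plus NON_NEGATIVE_DIFFERENCE steps on made-up data. The sample values, the series keys, and the helper names are all assumptions for illustration; empty time buckets are simply left out.

```python
from collections import defaultdict

# Hypothetical samples: (second, series key, cumulative counter value).
# Each series key stands for one combination of tag values.
SAMPLES = [
    (0, ("ios", "srv1"), 100),
    (1, ("web", "srv1"), 10),
    (2, ("ios", "srv1"), 101),
    (3, ("web", "srv1"), 12),
    (4, ("ios", "srv1"), 103),
    (5, ("web", "srv1"), 13),
]


def max_per_second(samples):
    """Largest counter value seen in each one-second bucket, in time order."""
    buckets = {}
    for second, _series, value in samples:
        buckets[second] = max(value, buckets.get(second, value))
    return [buckets[second] for second in sorted(buckets)]


def non_negative_difference(values):
    """Differences between consecutive values, dropping the negative ones."""
    return [b - a for a, b in zip(values, values[1:]) if b - a >= 0]


def merged_total(samples):
    """Difference all samples as if they belonged to one series."""
    return sum(non_negative_difference(max_per_second(samples)))


def per_series_total(samples):
    """Difference each series on its own, then add the results."""
    by_series = defaultdict(list)
    for sample in samples:
        by_series[sample[1]].append(sample)
    return sum(merged_total(group) for group in by_series.values())


print(merged_total(SAMPLES))      # jumps between counters are counted as requests
print(per_series_total(SAMPLES))  # only real increases are counted
```

In this toy data each counter grows by 3, so the true total is 6, while the merged calculation also counts every jump from the small counter up to the large one.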
But when we add GROUP BY *:
SELECT "received" FROM "http_metrics" WHERE $timeFilter GROUP BY *;
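it splits into many independent, smooth series.
Each of these series can now be differenced on its own, so we can wrap that in a subquery and sum the results.
Total: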
SELECT SUM("received") as "received" FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 6h
GROUP BY time(1s), *
) WHERE time >= now() - 6h;
time received
---- --------
2018-07-17T01:07:46.184292033Z 1367
Grouped by platform:
SELECT SUM("received") as "received" FROM (
SELECT NON_NEGATIVE_DIFFERENCE(MAX("received")) as "received"
FROM "http_metrics"
WHERE time >= now() - 6h
GROUP BY time(1s), *
) WHERE time >= now() - 6h
GROUP BY "platform";
... I'm not going to bore you with the response, but the per-platform sums add up to the total sum.
So I guess the moral of the story is: whenever you want to take the difference of a counter that has tags, you need GROUP BY *.
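Once the per-series increases are available, the percentage per platform from the question is plain arithmetic. The Python sketch below shows one way to do it on the client side. The readings, the (platform, server) keys, and the function names are hypothetical, and the drop to 0 in one series stands in for a service restart.

```python
from collections import defaultdict

# Hypothetical cumulative counter readings per (platform, server) series.
READINGS = {
    ("ios", "srv1"): [100, 101, 103],
    ("ios", "srv2"): [50, 50, 51],
    ("web", "srv1"): [10, 12, 13],
    ("web", "srv2"): [7, 9, 0, 3],  # the drop to 0 simulates a restart
}


def increase(readings):
    """Sum of the non-negative steps of one counter series."""
    return sum(b - a for a, b in zip(readings, readings[1:]) if b >= a)


def platform_totals(readings_by_series):
    """Add up the per-series increases for each platform."""
    totals = defaultdict(int)
    for (platform, _server), readings in readings_by_series.items():
        totals[platform] += increase(readings)
    return dict(totals)


def platform_shares(totals):
    """Percentage of all requests attributed to each platform."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {platform: 0.0 for platform in totals}
    return {platform: 100.0 * n / grand_total for platform, n in totals.items()}


totals = platform_totals(READINGS)
shares = platform_shares(totals)
for platform in sorted(shares):
    print(f"{platform}: {totals[platform]} requests, {shares[platform]:.1f}%")
```

Because every series is differenced before anything is summed, the per-platform totals always add up to the grand total, which is the property the grouped query above relies on.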