I am using the following query:
SELECT a.session_id,
a.created_at,
COUNT(DISTINCT a.mongo_id) AS events
FROM table1 a
JOIN table1 b ON a.session_id = b.session_id
GROUP BY a.session_id,
a.created_at
ORDER BY a.session_id,
a.created_at,
COUNT(DISTINCT a.mongo_id) DESC
It returns the following result (session, timestamp, event count):
Session1 2018-10-09 14:04:31.0 22
Session1 2018-10-09 14:04:32.0 10
Session1 2018-10-09 14:04:34.0 1
Session1 2018-10-09 14:04:38.0 1
Session1 2018-10-09 14:04:41.0 1
Session1 2018-10-09 14:04:42.0 1
Session1 2018-10-09 14:04:43.0 2
Session1 2018-10-09 14:04:44.0 2
Session1 2018-10-09 14:04:45.0 1
Session1 2018-10-09 14:04:46.0 2
Session1 2018-10-09 14:04:47.0 2
Session1 2018-10-09 14:04:50.0 2
Session1 2018-10-09 14:04:51.0 2
Session1 2018-10-09 14:04:52.0 1
Session1 2018-10-09 14:04:53.0 1
Session1 2018-10-09 14:04:55.0 1
Session1 2018-10-09 14:04:56.0 1
Session1 2018-10-09 14:04:57.0 1
Session1 2018-10-09 14:05:00.0 1
Session1 2018-10-09 14:05:01.0 2
Session1 2018-10-09 14:05:03.0 3
Session1 2018-10-09 14:05:06.0 1
Session1 2018-10-09 14:05:07.0 2
Session1 2018-10-09 14:05:09.0 4
Session1 2018-10-09 14:05:10.0 30
I want to group all events that occur within 3 seconds of each other, to get the following result:
Session1 2018-10-09 14:04:31.0 33
Session1 2018-10-09 14:04:38.0 2
Session1 2018-10-09 14:04:42.0 6
Session1 2018-10-09 14:04:46.0 4
Session1 2018-10-09 14:04:50.0 6
Session1 2018-10-09 14:04:55.0 3
Session1 2018-10-09 14:05:00.0 6
Session1 2018-10-09 14:05:06.0 7
Session1 2018-10-09 14:05:10.0 30
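That is, I want to sum the events that fall within each 3-second span, which gives the last column shown above.
To achieve this, I tried the following query: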
WITH t AS
(
SELECT a.session_id,
a.created_at,
COUNT(DISTINCT a.mongo_id) AS events
FROM table1 a
JOIN table1 b ON a.session_id = b.session_id
GROUP BY a.session_id,
a.created_at
ORDER BY a.session_id,
a.created_at,
COUNT(DISTINCT a.mongo_id) DESC
)
SELECT a.session_id,
TIMESTAMP WITH TIME ZONE 'epoch' +INTERVAL '1 second' *ROUND(EXTRACT('epoch' FROM a.created_at) / 3)*3 AS TIMESTAMP,
SUM(b.events)
FROM t AS a
JOIN t AS b ON a.session_id = b.session_id
GROUP BY a.session_id,
ROUND(EXTRACT('epoch' FROM a.created_at) / 3)
ORDER BY a.session_id,
TIMESTAMP
But this gives me the wrong numbers.
How can I achieve this? Any help would be greatly appreciated.
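Answer 0 (score: 0)
Let me assume that you already have the specified results (your first result set) from some query. You can then use window functions: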
with results as (
<whatever>
)
select session_id, min(created_at), max(created_at), sum(events)
from (select r.*,
             -- running count of group starts, used as the group identifier
             sum( (prev_ca < created_at - interval '3 second')::int ) over (partition by session_id order by created_at rows between unbounded preceding and current row) as grp
      from (select r.*,
                   -- created_at of the previous row in the same session
                   lag(created_at) over (partition by session_id order by created_at) as prev_ca
            from results r
           ) r
     ) r
group by session_id, grp;
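This works by looking at the previous created_at and checking whether it is more than 3 seconds earlier than the current row, which determines where a group starts. If it is, a new group starts.
The cumulative sum of these group starts is a grouping identifier that can be used for aggregation.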
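As a rough illustration of the rule described above, here is a small Python sketch that replays it on the Session1 rows from the question. It is not part of the original answer, and the names `parse`, `group_by_gap` and `group_by_anchor` are made up for this example. `group_by_gap` mirrors the lag-plus-running-sum idea: a group starts when the previous row is more than 3 seconds earlier. On the sample rows that seems to give only two groups, because after 14:04:38 no two consecutive rows are more than 3 seconds apart. `group_by_anchor` is an alternative reading, assumed rather than confirmed by the asker, in which a group starts when a row is more than 3 seconds after the first row of the current group. That reading reproduces the sums in the desired output.

```python
from datetime import datetime, timedelta

# Per-second event counts for Session1, copied from the question's first result set.
ROWS = [
    ("14:04:31", 22), ("14:04:32", 10), ("14:04:34", 1), ("14:04:38", 1),
    ("14:04:41", 1), ("14:04:42", 1), ("14:04:43", 2), ("14:04:44", 2),
    ("14:04:45", 1), ("14:04:46", 2), ("14:04:47", 2), ("14:04:50", 2),
    ("14:04:51", 2), ("14:04:52", 1), ("14:04:53", 1), ("14:04:55", 1),
    ("14:04:56", 1), ("14:04:57", 1), ("14:05:00", 1), ("14:05:01", 2),
    ("14:05:03", 3), ("14:05:06", 1), ("14:05:07", 2), ("14:05:09", 4),
    ("14:05:10", 30),
]
WINDOW = timedelta(seconds=3)


def parse(rows):
    """Turn (time, count) pairs into (datetime, count) pairs on 2018-10-09."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return [(datetime.strptime("2018-10-09 " + t, fmt), n) for t, n in rows]


def group_by_gap(rows, window=WINDOW):
    """New group when the PREVIOUS ROW is more than `window` earlier."""
    groups = []  # each entry: [first_ts, last_ts, total_events]
    prev = None
    for ts, n in rows:
        if prev is None or ts - prev > window:
            groups.append([ts, ts, 0])
        groups[-1][1] = ts
        groups[-1][2] += n
        prev = ts
    return groups


def group_by_anchor(rows, window=WINDOW):
    """New group when the row is more than `window` after the GROUP'S FIRST ROW."""
    groups = []  # each entry: [first_ts, last_ts, total_events]
    for ts, n in rows:
        if not groups or ts - groups[-1][0] > window:
            groups.append([ts, ts, 0])
        groups[-1][1] = ts
        groups[-1][2] += n
    return groups


if __name__ == "__main__":
    rows = parse(ROWS)
    for label, fn in (("gap", group_by_gap), ("anchor", group_by_anchor)):
        print(label)
        for first, last, total in fn(rows):
            print("  Session1", first, last, total)
```

The anchored variant depends on where the current group began, so it cannot be expressed with a single lag comparison. In SQL it would probably need a recursive query or a procedural loop, which is why it is only sketched here in Python.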