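Calculating sessions per minute from start and end time spans

Time: 2016-04-29 01:36:23

Tags: sql postgresql session amazon-redshift

I have a table of user activity records, each covering a range marked by a start and an end time. I want the number of users active in the system per unit of time over the previous day.

Sessions last at most one hour and never cross an hour boundary. A session can end and a new session can start within the same minute.

Here is a stripped-down version of the query: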

with minutes AS (
    -- ignore this...it generates a day's worth of timestamps for each minute
    -- it's hairy but is what I'm stuck with on redshift
    select (dateadd(minute, -row_number() over (order by true), sysdate::date)) as minute
        from seed_table limit 1440
),
sessions as (
    select sid, ts_start, ts_end
    from user_sessions s
    where ts_end >= sysdate::date-'1 day'::interval 
        and ts_start < sysdate::date
)
select m.minute, count(distinct(s.sid))
from minutes m
left join sessions s on s.ts_end >= m.minute and s.ts_start < m.minute+'1 min'::interval
group by 1

I am trying to avoid that nasty left join:

->  XN Nested Loop Left Join DS_BCAST_INNER  (cost=6913826151.95..4727012848741.55 rows=410434560 width=166)
    Join Filter: (("inner".ts_start < ("outer"."minute" + '00:01:00'::interval)) AND ("inner".ts_end >= "outer"."minute"))

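Following Gordon Linoff's answer, the query below almost works for me. The count is off when a user's sessions transition within a single minute, but it looks like the right direction. The original query can over-count for the same reason, although taking a distinct count of session IDs per minute takes care of that.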

select minute, sum(count) over (order by minute rows unbounded preceding) as users
from (
    select minute, sum(count) as count
    from (
        (
            select date_trunc('minute', ts_start) as minute, count(*) as count
            from sessions
            group by 1
        ) union all (
            select date_trunc('minute', ts_end) as minute, - count(*) as count
            from sessions
            group by 1
        )
    ) s1
    group by minute
) s2
order by minute;

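For comparison, here are the timings on one hour of data:

  1. Original query time: 81301.345 ms
  2. Summary query time: 36242.342 ms

1 answer:

Answer 0: (score: 2)

You can do this much faster by counting the starts and stops in each minute and then taking a cumulative sum. The result looks like this: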

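-- inner sum() aggregates per minute (group by); outer sum() over (...) is the running total
select minute, sum(sum(cnt)) over (order by minute rows unbounded preceding) as users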
from ((select date_trunc('minute', ts_start) as minute, count(*) as cnt
       from sessions
       group by 1
      ) union all
      (select date_trunc('minute', ts_end), - count(*)
       from sessions
       group by 1
      )
     ) s
group by minute
order by minute;
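
To make the start/stop counting concrete, here is a minimal in-memory sketch in Python rather than SQL. It is an illustration only: the sample rows are hypothetical, the helper names (`trunc_minute`, `running_counts`, `overlap_counts`) are made up for this example, and it assumes the two back-to-back rows for "a" share one `sid`. It runs the +1/-1 running sum next to a brute-force version of the join condition from the question so the two can be compared minute by minute.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import accumulate


def trunc_minute(ts):
    """Rough stand-in for date_trunc('minute', ts)."""
    return ts.replace(second=0, microsecond=0)


def running_counts(sessions):
    """+1 in each start minute, -1 in each end minute, then a running sum."""
    deltas = Counter()
    for _sid, ts_start, ts_end in sessions:
        deltas[trunc_minute(ts_start)] += 1
        deltas[trunc_minute(ts_end)] -= 1
    minutes = sorted(deltas)
    totals = accumulate(deltas[m] for m in minutes)
    return dict(zip(minutes, totals))


def overlap_counts(sessions, minutes):
    """Brute force: distinct sids whose span touches each minute (the join condition)."""
    one_min = timedelta(minutes=1)
    return {
        m: len({sid for sid, s, e in sessions if e >= m and s < m + one_min})
        for m in minutes
    }


# Hypothetical sample: "a" ends one session and starts another inside 10:05.
sessions = [
    ("a", datetime(2016, 4, 28, 10, 0, 10), datetime(2016, 4, 28, 10, 5, 20)),
    ("a", datetime(2016, 4, 28, 10, 5, 40), datetime(2016, 4, 28, 10, 7, 0)),
    ("b", datetime(2016, 4, 28, 10, 2, 0), datetime(2016, 4, 28, 10, 3, 30)),
]

sweep = running_counts(sessions)
brute = overlap_counts(sessions, sorted(sweep))
for minute in sorted(sweep):
    print(minute.time(), "sweep:", sweep[minute], "join:", brute[minute])
```

On this sample the two agree at 10:00, 10:02 and 10:05, but the running sum comes out one lower than the join at 10:03 and 10:07, the minutes in which a session ends: the -1 lands in a minute during which the session was still partly active. Note also that the running sum only produces rows for minutes that contain a start or an end, so quiet minutes would still need to be filled in from a minutes table.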