Rolling daily distinct counts

Time: 2013-06-14 01:23:33

Tags: sql oracle aggregate-functions

We have a table with the following columns:

SESSION_ID      USER_ID          CONNECT_TS
--------------  ---------------  ---------------
1               99               2013-01-01 2:23:33
2               101              2013-01-01 2:23:55
3               104              2013-01-01 2:24:41
4               101              2013-01-01 2:24:43
5               233              2013-01-01 2:25:01
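
To try the queries in this thread against the sample rows, a minimal setup sketch might look like the following. The column types (NUMBER and DATE) and the 24-hour reading of the timestamps are assumptions; the post does not state them.

```sql
-- Assumed DDL: the post shows column names only, not data types.
create table sessions (
  session_id  number primary key,
  user_id     number not null,
  connect_ts  date   not null
);

-- Sample rows from the table above; times are read as 24-hour values (assumption).
insert into sessions values (1,  99, to_date('2013-01-01 02:23:33', 'YYYY-MM-DD HH24:MI:SS'));
insert into sessions values (2, 101, to_date('2013-01-01 02:23:55', 'YYYY-MM-DD HH24:MI:SS'));
insert into sessions values (3, 104, to_date('2013-01-01 02:24:41', 'YYYY-MM-DD HH24:MI:SS'));
insert into sessions values (4, 101, to_date('2013-01-01 02:24:43', 'YYYY-MM-DD HH24:MI:SS'));
insert into sessions values (5, 233, to_date('2013-01-01 02:25:01', 'YYYY-MM-DD HH24:MI:SS'));
commit;
```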

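For each day we need to calculate a count of "active users", defined as users who have used the application within the previous 45 days. Here is what we have come up with, but I feel there must be a better way: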

select trunc(a.connect_ts)
, count(distinct a.user_id) daily_users
, count(distinct b.user_id) active_users
from sessions a
  join sessions b
    on (b.connect_ts between trunc(a.connect_ts) - 45 and trunc(a.connect_ts))
where a.connect_ts between date '2013-01-01' and date '2013-06-12'
  and b.connect_ts between date '2012-11-01' and date '2013-06-12'
group by trunc(a.connect_ts);

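We looked at window functions, but they don't appear to support distinct counts. We also considered loading aggregates into a temporary table first, but again the distinct counts rule that out. Is there a better way to do this?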

2 Answers:

Answer 0: (score: 0)

The first thing to do is generate a list of the days you're interested in:

select (trunc(sysdate, 'yyyy') -1) + level as ts_day
from dual
connect by level <= to_number( to_char(sysdate, 'DDD' ) )

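This generates a table of dates from 01-JAN of this year up to today. Join your table to this subquery. A cross join may not be efficient, depending on how much data you have in that range, so treat this as a proof of concept and tune as needed.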

with days as
 ( select (trunc(sysdate, 'yyyy') -1) + level as ts_day
   from dual
   connect by level <= to_number( to_char(sysdate, 'DDD' ) ) )
select days.ts_day
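       -- count(distinct ...) counts users rather than sessions; non-matching rows yield NULL and are ignored
       , count ( distinct case when trunc(connect_ts) = ts_day then user_id end ) as daily_users
       , count ( distinct case when trunc(connect_ts) between ts_day - 45 and ts_day then user_id end ) as active_users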
from days
     cross join sessions  
where connect_ts between trunc(sysdate, 'yyyy') - 45 and sysdate
group by ts_day
order by ts_day
/
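
One possible way to shrink the cross join, sketched here rather than tested against real volumes: reduce sessions to one row per user per day first, then join each calendar day only to the user-days that fall inside its 45-day window. The CTE name user_days and the hard-coded dates (taken from the range in the question) are illustrative assumptions.

```sql
with days as (
  -- one row per calendar day in the reporting range (dates are illustrative)
  select date '2013-01-01' + level - 1 as ts_day
  from dual
  connect by level <= date '2013-06-12' - date '2013-01-01' + 1
),
user_days as (
  -- at most one row per user per day, including the 45-day lead-in period
  select distinct trunc(connect_ts) as logon_day, user_id
  from sessions
  where connect_ts >= date '2013-01-01' - 45
    and connect_ts <  date '2013-06-12' + 1
)
select d.ts_day
     -- users seen on the day itself
     , count(distinct case when u.logon_day = d.ts_day then u.user_id end) as daily_users
     -- users seen anywhere in the window ending on that day
     , count(distinct u.user_id) as active_users
from days d
  left join user_days u
    on u.logon_day between d.ts_day - 45 and d.ts_day
group by d.ts_day
order by d.ts_day;
```

The left join keeps days with no sessions at all, which then report zero for both counts.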

Answer 1: (score: 0)

If your Oracle version supports the WITH clause, this might help you:

with sel as (
select trunc(connect_ts) as logon_day
, count(distinct user_id) as logon_count
from sessions
group by trunc(connect_ts)
)
select s1.logon_day
, s1.logon_count as daily_users
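-- note: this sums per-day counts, so a user who logged on on several days in the window is counted once per day
, (select sum(logon_count) from sel where logon_day between s1.logon_day - 45 and s1.logon_day) as active_users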
from sel s1

Otherwise you will have to write it like this (which executes much more slowly...):

select sel.logon_day
, sel.logon_count as daily_users
, (select count(distinct user_id) as logon_count
      from sessions
      where trunc(connect_ts) between sel.logon_day - 45 and sel.logon_day) as active_users
from (select trunc(connect_ts) as logon_day, count(distinct user_id) as logon_count
      from sessions
      group by trunc(connect_ts)) sel
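
The two queries in this answer do not necessarily return the same active_users figure: the first adds up per-day distinct counts, while the second counts distinct users across the whole window. A small self-contained check, using three made-up rows in a hypothetical sessions_demo CTE instead of the real table, illustrates the gap:

```sql
with sessions_demo as (
  -- made-up rows: user 101 logs on on two different days, user 104 on one
  select 101 as user_id, date '2013-01-01' as connect_ts from dual union all
  select 101,            date '2013-01-02'               from dual union all
  select 104,            date '2013-01-02'               from dual
),
sel as (
  select trunc(connect_ts) as logon_day
       , count(distinct user_id) as logon_count
  from sessions_demo
  group by trunc(connect_ts)
)
select (select sum(logon_count) from sel)                  as summed_daily_counts  -- 1 + 2 = 3
     , (select count(distinct user_id) from sessions_demo) as distinct_users       -- 2
from dual;
```

If "active users" means distinct people, the second figure is the one that matches the definition in the question.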