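Get the gain/loss of new users compared to the previous day

Time: 2016-09-28 14:47:12

Tags: sql oracle

I have a user traffic table, and I need the gain/loss in users compared to the previous day. I'm wondering whether there is a better way to do this than the solution below.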

Schema:

Table structure: session_id, session_day, user_id, product_id

What have I tried?

SELECT session_day,
       session_count,
       user_count - LAG( user_count, 1 ) OVER ( ORDER BY session_day ) AS gain_loss_users
  FROM   
    (
        SELECT session_day,
               COUNT( session_id ) AS session_count,
               COUNT( user_id ) user_count
          FROM user_traffic
         GROUP BY session_day
     ) X ; 

2 answers:

Answer 0: (score: 1)

I tried to tackle the "new" and "returning" users problem. Here is my attempt:

    select session_day, 
       COUNT( distinct user_id ) AS user_cnt,
       count(distinct user_id) - lag(count(distinct user_id)) 
                                     over (order by session_day) gain,
       count(newu) AS  newu, count(returnu) AS returnu
  from (
          select session_id,
                 session_day,
                 user_id, 
                 CASE WHEN
                 count(*) over ( partition by user_id ORDER BY session_day,session_id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW )
                           = 1 
                      THEN 1
                  END 
                  AS newu,
                 CASE WHEN 
                 lag( session_day,1 ) over ( partition by user_id ORDER BY session_day,session_id ) 
                           <> 
                           lag( session_day,1 ) over ( order by session_day,session_id ) 
                      THEN 1
                 END  AS returnu    
            from user_traffic u
        )
  group by session_day
  order by session_day;

Test data and output:

create table user_traffic (session_id number(6), session_day date, 
                           user_id number(6), product_id number(6));

insert into user_traffic values (  1, date '2016-09-07', 101, 1);
insert into user_traffic values (  2, date '2016-09-07', 101, 4);
insert into user_traffic values (  3, date '2016-09-07', 102, 1);
insert into user_traffic values (  4, date '2016-09-08', 101, 2);
insert into user_traffic values (  5, date '2016-09-08', 101, 4);
insert into user_traffic values (  6, date '2016-09-09', 102, 1);
insert into user_traffic values (  7, date '2016-09-10', 102, 1);
insert into user_traffic values (  8, date '2016-09-10', 103, 3);

SESSION_DAY        CNT       GAIN        NEW    RETURNS
----------- ---------- ---------- ---------- ----------
2016-09-07           2                     2          0   -- 101 & 102 are new
2016-09-08           1         -1          0          0
2016-09-09           1          0          0          1   -- 102 returned
2016-09-10           2          1          1          0   -- 103 is new
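As a cross-check of the CNT, GAIN and NEW columns, here is a minimal Python sketch that recomputes them from the same test rows. It is not part of the original answer: the function name `daily_summary` is hypothetical, and it assumes a user counts as "new" on the first day they appear in the table. The RETURNS column is left out because its definition is specific to the LAG comparison in the query.

```python
from collections import defaultdict

# (session_id, session_day, user_id) copied from the INSERT statements above
rows = [
    (1, "2016-09-07", 101), (2, "2016-09-07", 101), (3, "2016-09-07", 102),
    (4, "2016-09-08", 101), (5, "2016-09-08", 101), (6, "2016-09-09", 102),
    (7, "2016-09-10", 102), (8, "2016-09-10", 103),
]

def daily_summary(rows):
    """Return (day, distinct_users, gain_vs_previous_day, new_users) per day."""
    users_by_day = defaultdict(set)
    for _session_id, day, user in rows:
        users_by_day[day].add(user)

    seen = set()        # users observed on any earlier day
    prev_count = None   # distinct-user count of the previous day in the data
    summary = []
    for day in sorted(users_by_day):   # ISO date strings sort chronologically
        users = users_by_day[day]
        gain = None if prev_count is None else len(users) - prev_count
        summary.append((day, len(users), gain, len(users - seen)))
        seen |= users
        prev_count = len(users)
    return summary
```

Calling `daily_summary(rows)` yields one tuple per day in the same column order as the first four columns of the output table. Like `LAG`, it compares against the previous day present in the data, not the previous calendar day.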

Answer 1: (score: 0)

There isn't a better way, but there is a more concise one. You can mix window functions with aggregate functions:

   SELECT session_day,
          COUNT(session_id ) as session_count,
          COUNT(DISTINCT user_id ) as user_count,
          (COUNT(DISTINCT user_id ) - 
           LAG(COUNT(DISTINCT user_id )) OVER (ORDER BY session_day)
          ) as gain_loss_users
      FROM user_traffic
     GROUP BY session_day;

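I think you need COUNT(DISTINCT), because (1) a user may have multiple sessions on the same day, and (2) without DISTINCT the two counts are identical (as long as user_id and session_id are never NULL).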
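To illustrate point (2), here is a small Python sketch that mimics the three counts on the test rows from the other answer. It is an illustration only: the helper name `counts_per_day` is hypothetical, and `None` stands in for SQL NULL, which `COUNT(col)` skips.

```python
from collections import defaultdict

# (session_id, session_day, user_id) from the test data in the other answer
rows = [
    (1, "2016-09-07", 101), (2, "2016-09-07", 101), (3, "2016-09-07", 102),
    (4, "2016-09-08", 101), (5, "2016-09-08", 101), (6, "2016-09-09", 102),
    (7, "2016-09-10", 102), (8, "2016-09-10", 103),
]

def counts_per_day(rows):
    """Map day -> (COUNT(session_id), COUNT(user_id), COUNT(DISTINCT user_id))."""
    sessions = defaultdict(list)
    users = defaultdict(list)
    for session_id, day, user_id in rows:
        # COUNT(col) ignores NULLs, so None values are skipped here too
        if session_id is not None:
            sessions[day].append(session_id)
        if user_id is not None:
            users[day].append(user_id)
    return {
        day: (len(sessions[day]), len(users[day]), len(set(users[day])))
        for day in sorted(sessions.keys() | users.keys())
    }
```

For 2016-09-07 this gives 3 sessions, 3 for the plain user count, and 2 distinct users: the plain `COUNT(user_id)` simply repeats the session count, and only the distinct count reflects how many users were active.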