SQL (Vertica) - Count the users who returned to the app on at least x days in the past 7 days

Time: 2017-05-26 05:45:22

Tags: sql date vertica vsql

Suppose I have a table like this:

uid  day_used_app   
---  -------------
1    2012-04-28      
1    2012-04-29        
1    2012-04-30        
2    2012-04-29       
2    2012-04-30 
2    2012-05-01       
2    2012-05-21        
2    2012-05-22   

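Suppose I want the number of unique users who returned to the app on at least 2 different days in the past 7 days (counting back from 2012-05-03).

For example, this query retrieves the number of users who used the app on at least 2 days in the past 7 days: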

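select count(distinct case when num_different_days_on_app >= 2
                           then uid else null end) as users_return_2_or_more_days
    from (
         -- distinct days on which each user opened the app within the window
         select uid,
                count(distinct day_used_app) as num_different_days_on_app
             from table  -- placeholder: substitute the real table name
         where day_used_app between current_date() - 7 and current_date()
         group by 1
        ) t  -- a subquery in FROM needs an alias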

This gives me:

users_return_2_or_more_days
---------------------------
            2
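
As a quick sanity check, the same count can be reproduced on the sample rows with a fixed reference date in place of current_date(). This is only a sketch: it runs on Python's built-in sqlite3 module rather than Vertica, and the table name app_usage, the function users_returning, and SQLite's date(x, '-7 days') syntax are assumptions made for illustration (in Vertica the arithmetic would stay as `date - 7`).

```python
import sqlite3

# Sample rows from the question: (uid, day_used_app)
ROWS = [
    (1, "2012-04-28"), (1, "2012-04-29"), (1, "2012-04-30"),
    (2, "2012-04-29"), (2, "2012-04-30"), (2, "2012-05-01"),
    (2, "2012-05-21"), (2, "2012-05-22"),
]

def users_returning(ref_date, min_days=2):
    """Count users with at least min_days distinct usage days in [ref_date - 7, ref_date]."""
    conn = sqlite3.connect(":memory:")
    # app_usage is a hypothetical table name standing in for the question's `table`
    conn.execute("CREATE TABLE app_usage (uid INTEGER, day_used_app TEXT)")
    conn.executemany("INSERT INTO app_usage VALUES (?, ?)", ROWS)
    sql = """
        SELECT COUNT(*)
        FROM (
            SELECT uid
            FROM app_usage
            -- BETWEEN is inclusive on both ends, as in the original query
            WHERE day_used_app BETWEEN date(:ref, '-7 days') AND :ref
            GROUP BY uid
            HAVING COUNT(DISTINCT day_used_app) >= :min_days
        ) AS t
    """
    (n,) = conn.execute(sql, {"ref": ref_date, "min_days": min_days}).fetchone()
    conn.close()
    return n

print(users_returning("2012-05-03"))
```

With 2012-05-03 as the reference date, both sample users have three distinct days inside the window, which agrees with the result of 2 shown above.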

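My question is:

What if I want to do this for every day, so that my table looks like the one below, where the second field is the number of unique users who returned on 2 or more different days in the week leading up to the date in the first field?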

        date               users_return_2_or_more_days
      --------             ---------------------------
    2012-04-28                        2
    2012-04-29                        2 
    2012-04-30                        3           
    2012-05-01                        4     
    2012-05-02                        4       
    2012-05-03                        3

2 answers:

Answer 0: (score: 1)

Would this help?

WITH
-- your original input, don't use in "real" query ...
input(uid,day_used_app) AS (
          SELECT 1,DATE '2012-04-28'
UNION ALL SELECT 1,DATE '2012-04-29'
UNION ALL SELECT 1,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-04-29'
UNION ALL SELECT 2,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-05-01'
UNION ALL SELECT 2,DATE '2012-05-21'
UNION ALL SELECT 2,DATE '2012-05-22'
)
-- end of input, start "real" query here, replace ',' with 'WITH'
,
one_week_b4 AS (
  SELECT
    uid
  , day_used_app
  , day_used_app -7 AS day_used_1week_b4
  FROM input
)
SELECT
  one_week_b4.uid
, one_week_b4.day_used_app
, count(*) AS users_return_2_or_more_days
FROM one_week_b4
JOIN input
  ON input.day_used_app BETWEEN one_week_b4.day_used_1week_b4 AND one_week_b4.day_used_app
GROUP BY
  one_week_b4.uid
, one_week_b4.day_used_app
HAVING count(*) >= 2
ORDER BY 1;

The output is:

uid|day_used_app|users_return_2_or_more_days
  1|2012-04-29  |                          3
  1|2012-04-30  |                          5
  2|2012-04-29  |                          3
  2|2012-04-30  |                          5
  2|2012-05-01  |                          6
  2|2012-05-22  |                          2

Does this help meet your needs?

Marco the Sane ......
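
Note that the join in the query above is not restricted by uid and the result is grouped per user and day, so count(*) counts usage rows from all users in the window rather than users. A hedged sketch of a variant that first counts distinct days per user inside each window and then counts qualifying users per date follows. It runs on Python's built-in sqlite3 module, not Vertica; the names app_usage, days, per_user_day, and returning_users_per_day, and SQLite's date(x, '-7 days') syntax, are assumptions for illustration.

```python
import sqlite3

# Sample rows from the question: (uid, day_used_app)
ROWS = [
    (1, "2012-04-28"), (1, "2012-04-29"), (1, "2012-04-30"),
    (2, "2012-04-29"), (2, "2012-04-30"), (2, "2012-05-01"),
    (2, "2012-05-21"), (2, "2012-05-22"),
]

def returning_users_per_day(min_days=2):
    """For each date present in the table, count users with >= min_days distinct days in the window."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE app_usage (uid INTEGER, day_used_app TEXT)")
    conn.executemany("INSERT INTO app_usage VALUES (?, ?)", ROWS)
    sql = """
        WITH days AS (
            -- one row per date that occurs in the usage table
            SELECT DISTINCT day_used_app AS day FROM app_usage
        ),
        per_user_day AS (
            -- distinct usage days per user inside each date's window
            SELECT d.day, u.uid,
                   COUNT(DISTINCT u.day_used_app) AS days_in_window
            FROM days d
            JOIN app_usage u
              ON u.day_used_app BETWEEN date(d.day, '-7 days') AND d.day
            GROUP BY d.day, u.uid
        )
        SELECT day,
               SUM(CASE WHEN days_in_window >= ? THEN 1 ELSE 0 END) AS users
        FROM per_user_day
        GROUP BY day
        ORDER BY day
    """
    result = conn.execute(sql, (min_days,)).fetchall()
    conn.close()
    return result

for day, users in returning_users_per_day():
    print(day, users)
```

Because the dates come from the table itself, days on which nobody used the app do not appear in this result.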

Answer 1: (score: 0)

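-- One output row per distinct date in the table; "table" is a placeholder name
SELECT DISTINCT
    t1.day_used_app,
    (
        -- users with at least 2 distinct usage days in the window ending on t1.day_used_app
        SELECT SUM(CASE WHEN t.num_visits >= 2 THEN 1 ELSE 0 END)
        FROM
        (
            SELECT uid,
                   COUNT(DISTINCT day_used_app) AS num_visits
            FROM table
            WHERE day_used_app BETWEEN t1.day_used_app - 7 AND t1.day_used_app
            GROUP BY uid
        ) t
    ) AS users_return_2_or_more_days
FROM table t1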
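
The query above takes its dates from the table itself, so dates with no usage rows (2012-05-02 and 2012-05-03 in the desired output) would not be listed. One possible way around that is to drive the correlated subquery from a generated calendar, sketched below. It uses Python's built-in sqlite3 module with a recursive CTE instead of Vertica; the names app_usage, calendar, and returning_users_by_calendar are hypothetical, and the date syntax is SQLite's, so this is an illustration of the idea rather than a Vertica-ready statement.

```python
import sqlite3

# Sample rows from the question: (uid, day_used_app)
ROWS = [
    (1, "2012-04-28"), (1, "2012-04-29"), (1, "2012-04-30"),
    (2, "2012-04-29"), (2, "2012-04-30"), (2, "2012-05-01"),
    (2, "2012-05-21"), (2, "2012-05-22"),
]

def returning_users_by_calendar(start, end, min_days=2):
    """For every calendar day in [start, end], count users with >= min_days distinct days in the window."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE app_usage (uid INTEGER, day_used_app TEXT)")
    conn.executemany("INSERT INTO app_usage VALUES (?, ?)", ROWS)
    sql = """
        WITH RECURSIVE calendar(day) AS (
            -- generate one row per day, including days with no usage
            SELECT :start
            UNION ALL
            SELECT date(day, '+1 day') FROM calendar WHERE day < :end
        )
        SELECT c.day,
               (
                   SELECT COUNT(*)
                   FROM (
                       SELECT u.uid
                       FROM app_usage u
                       WHERE u.day_used_app
                             BETWEEN date(c.day, '-7 days') AND c.day
                       GROUP BY u.uid
                       HAVING COUNT(DISTINCT u.day_used_app) >= :min_days
                   ) AS t
               ) AS users_return_2_or_more_days
        FROM calendar c
        ORDER BY c.day
    """
    params = {"start": start, "end": end, "min_days": min_days}
    result = conn.execute(sql, params).fetchall()
    conn.close()
    return result

for day, users in returning_users_by_calendar("2012-04-28", "2012-05-03"):
    print(day, users)
```

On the eight sample rows the counts can never exceed 2, because only two users exist, so the larger numbers in the question's desired table appear to be illustrative rather than derived from that sample.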