How can I generate activity ranges with SQL, given event dates?

Time: 2015-06-12 13:07:53

Tags: sql select

I have a table in the following format:

| user_name | date       | number_of_visits |
| cat005    | 2015-06-03 |      5           |
| cat005    | 2015-06-08 |      1           |
| dog009    | 2015-06-01 |      7           |
| dog009    | 2015-06-19 |      2           |

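So, for each user, I have the number of visits on a given date. If a user had no visits on a given date, there is no record in the database (in other words, number_of_visits is not saved when it equals zero).

Now I want to use this table to generate another table that contains the activity ranges for each user. We use the following definition of active: a user is considered "active" on a given day if he/she made at least one visit within the past 10 days. So I would like to have something like this: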

| user_name | active_start | active_end |
| cat005    | 2015-03-02   | 2015-03-25 |    
| cat005    | 2015-03-29   | 2015-06-01 |
| dog009    | 2015-04-01   | 2015-06-01 |

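Note that the data in the two examples are not consistent with each other. By the definition used, active_end is exclusive (meaning the user had no visit on that date). For example, the first row in the table above means that the user was active on 2015-03-02 (he/she made at least one visit). It also tells us that on 2015-03-01 (the day before) the user was not active, which in turn means that he/she had made no visits in the preceding 10 days. Likewise, on 2015-03-25 the user made no visit, and it was the 11th day without a visit (so the system "switched" the user to inactive).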

How can I generate the second table using SQL?
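
To make the definition above concrete, here is a minimal Python sketch of the "active" test. It reads "within the past 10 days" as the visit day plus the 10 following days, which is what the worked example in the question implies (the user is switched off on the 11th day without a visit). The function name `is_active`, the `window_days` parameter, and the visit dates are hypothetical; the dates were chosen only so that they reproduce the first row of the example table.

```python
from datetime import date, timedelta


def is_active(visit_dates, day, window_days=10):
    """Return True if `day` is a visit day or at most `window_days` days after one."""
    window = timedelta(days=window_days)
    return any(timedelta(0) <= day - visit <= window for visit in visit_dates)


# Hypothetical visit history (not from the question's data).
visits = [date(2015, 3, 2), date(2015, 3, 10), date(2015, 3, 14)]

print(is_active(visits, date(2015, 3, 1)))   # False: the day before the first visit
print(is_active(visits, date(2015, 3, 2)))   # True: a visit day
print(is_active(visits, date(2015, 3, 24)))  # True: 10th day after the last visit
print(is_active(visits, date(2015, 3, 25)))  # False: 11th day without a visit
```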

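2 answers:

Answer 0: (score: 0)

This is a bit tricky. One approach is to determine where each period of activity starts. Then take a cumulative sum of those start flags, which counts how many periods have begun up to each row. That cumulative sum provides the grouping criterion for the aggregation.

The following identifies the dates on which a period of activity starts: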

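select t.*,
       -- 1 when this visit is more than 10 days after the user's previous visit (a new period starts);
       -- a user's first row has no previous date, so the comparison is NULL and the flag is 0
       (case when date > lag(date) over (partition by user_name order by date) + 10 -- date arithmetic varies by database
             then 1
             else 0
        end) as StartPeriodFlag
from table t; -- "table" is a placeholder for the real table name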
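
As a rough illustration of what the flag column contains, the sketch below mimics the `lag` comparison in plain Python for one user's dates. The function name `start_period_flags` and the `gap_days` parameter are made up for this example; the dates are the sample rows from the question.

```python
from datetime import date, timedelta


def start_period_flags(dates, gap_days=10):
    """Flag each visit date (sorted ascending, one user) that starts a new period."""
    flags = []
    previous = None
    for current in dates:
        # Like the SQL: with no previous row the comparison is NULL, so the flag is 0.
        is_start = previous is not None and current > previous + timedelta(days=gap_days)
        flags.append(1 if is_start else 0)
        previous = current
    return flags


cat005 = [date(2015, 6, 3), date(2015, 6, 8)]
dog009 = [date(2015, 6, 1), date(2015, 6, 19)]

print(start_period_flags(cat005))  # [0, 0]: the visits are 5 days apart
print(start_period_flags(dog009))  # [0, 1]: the visits are 18 days apart
```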

The cumulative sum then provides the information needed for grouping:

with t as (
      select t.*,
             (case when date > lag(date) over (partition by user_name order by date) + 10 -- date arithmetic varies by database
                   then 1
                   else 0
              end) as StartPeriodFlag
      from table t
     )
select user_name, min(date) as startdate,
       max(date) + 10 as enddate -- exclusive end: 10 days after the last visit in the period
from (select t.*,
             -- running total of start flags: all rows of one period share the same grp value
             sum(StartPeriodFlag) over (partition by user_name order by date) as grp
      from t
     ) t
group by user_name, grp;
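
The flag, running-sum, and group-by steps can also be traced outside the database. The following is only a sketch in Python under the same convention as the query (a gap of more than 10 days starts a new period, and the end is the last visit plus 10 days); `activity_ranges` and `gap_days` are names invented for the illustration.

```python
from datetime import date, timedelta
from itertools import accumulate, groupby


def activity_ranges(rows, gap_days=10):
    """rows: iterable of (user_name, visit_date). Returns [(user_name, start, end)]."""
    gap = timedelta(days=gap_days)
    result = []
    for user, user_rows in groupby(sorted(rows), key=lambda row: row[0]):
        dates = [visit for _, visit in user_rows]
        # Step 1: flag the rows that start a new period (the first row gets 0).
        flags = [0] + [1 if cur > prev + gap else 0 for prev, cur in zip(dates, dates[1:])]
        # Step 2: the running sum of the flags is the group number.
        groups = list(accumulate(flags))
        # Step 3: aggregate per group: min(date) and max(date) + gap.
        for _, members in groupby(zip(groups, dates), key=lambda pair: pair[0]):
            period = [visit for _, visit in members]
            result.append((user, period[0], period[-1] + gap))
    return result


sample = [
    ("cat005", date(2015, 6, 3)),
    ("cat005", date(2015, 6, 8)),
    ("dog009", date(2015, 6, 1)),
    ("dog009", date(2015, 6, 19)),
]

for user, start, end in activity_ranges(sample):
    print(user, start.isoformat(), end.isoformat())
# cat005 2015-06-03 2015-06-18
# dog009 2015-06-01 2015-06-11
# dog009 2015-06-19 2015-06-29
```

If the stricter reading of the question is wanted (inactive only on the 11th day without a visit), the gap and the end offset would each need adjusting by one day; the sketch deliberately keeps the query's `+ 10`.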

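As the comment in the query notes, date arithmetic varies by database. This uses a simple + 10, which works on date values in Oracle and PostgreSQL, but the exact function differs elsewhere: for example DATEADD(day, 10, date) in SQL Server or DATE_ADD(date, INTERVAL 10 DAY) in MySQL.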

Answer 1: (score: 0)

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE ACTIVITY ( user_name, "date", number_of_visits ) AS
          SELECT 'cat005', DATE'2015-06-03', 5 FROM DUAL
UNION ALL SELECT 'cat005', DATE'2015-06-08', 1 FROM DUAL
UNION ALL SELECT 'dog009', DATE'2015-06-01', 7 FROM DUAL
UNION ALL SELECT 'dog009', DATE'2015-06-19', 2 FROM DUAL

Query 1:

WITH changes AS (
  SELECT user_name,
         "date",
         CASE WHEN "date" <= LAG( "date" ) OVER ( PARTITION BY user_name ORDER BY "date" ) + INTERVAL '10' DAY
              THEN 0
              ELSE 1 END AS change_group
  FROM   ACTIVITY
),
groups AS (
  SELECT user_name,
         "date",
         SUM( change_group ) OVER ( PARTITION BY user_name ORDER BY "date" ) AS grp
  FROM   changes
)
SELECT  user_name,
        MIN( "date" ) AS activity_start,
        MAX( "date" ) + INTERVAL '10' DAY AS activity_end
FROM    groups
GROUP BY
        USER_NAME,
        GRP

Results:

| USER_NAME |         ACTIVITY_START |           ACTIVITY_END |
|-----------|------------------------|------------------------|
|    dog009 | June, 19 2015 00:00:00 | June, 29 2015 00:00:00 |
|    dog009 | June, 01 2015 00:00:00 | June, 11 2015 00:00:00 |
|    cat005 | June, 03 2015 00:00:00 | June, 18 2015 00:00:00 |

Query 2:

WITH changes AS (
  SELECT user_name,
         "date",
         CASE WHEN "date" <= LAG( "date" ) OVER ( PARTITION BY user_name ORDER BY "date" ) + INTERVAL '10' DAY
              THEN null
              ELSE "date" END AS first_date,
         CASE WHEN "date" >= LEAD( "date" ) OVER ( PARTITION BY user_name ORDER BY "date" ) - INTERVAL '10' DAY
              THEN null
              ELSE "date" + INTERVAL '10' DAY END AS last_date
  FROM   ACTIVITY
)
SELECT DISTINCT
       user_name,
       LAST_VALUE(  first_date ) IGNORE NULLS OVER ( PARTITION BY user_name ORDER BY "date" ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS activity_start,
       FIRST_VALUE( last_date  ) IGNORE NULLS OVER ( PARTITION BY user_name ORDER BY "date" ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS activity_end
FROM   changes

Results:

| USER_NAME |         ACTIVITY_START |           ACTIVITY_END |
|-----------|------------------------|------------------------|
|    cat005 | June, 03 2015 00:00:00 | June, 18 2015 00:00:00 |
|    dog009 | June, 01 2015 00:00:00 | June, 11 2015 00:00:00 |
|    dog009 | June, 19 2015 00:00:00 | June, 29 2015 00:00:00 |
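
Query 2 avoids the GROUP BY by marking only the rows that open or close a period and then carrying those values across the neighbouring rows. The sketch below imitates that idea in Python for a single user, as an aid to reading the query rather than a replacement for it; `ranges_by_filling` and `gap_days` are hypothetical names.

```python
from datetime import date, timedelta


def ranges_by_filling(dates, gap_days=10):
    """dates: one user's visit dates, sorted ascending. Returns sorted (start, end) pairs."""
    gap = timedelta(days=gap_days)
    n = len(dates)
    # first_date: the date itself when the row opens a period, otherwise None.
    first = [d if i == 0 or d > dates[i - 1] + gap else None for i, d in enumerate(dates)]
    # last_date: date + gap when the row closes a period, otherwise None.
    last = [d + gap if i == n - 1 or d < dates[i + 1] - gap else None for i, d in enumerate(dates)]

    # LAST_VALUE ... IGNORE NULLS up to the current row: carry the latest start forward.
    starts, current = [], None
    for value in first:
        if value is not None:
            current = value
        starts.append(current)

    # FIRST_VALUE ... IGNORE NULLS from the current row on: carry the next end backward.
    ends, current = [None] * n, None
    for i in reversed(range(n)):
        if last[i] is not None:
            current = last[i]
        ends[i] = current

    # SELECT DISTINCT: rows of the same period now carry identical (start, end) pairs.
    return sorted(set(zip(starts, ends)))


print(ranges_by_filling([date(2015, 6, 3), date(2015, 6, 8)]))
# [(datetime.date(2015, 6, 3), datetime.date(2015, 6, 18))]
print(ranges_by_filling([date(2015, 6, 1), date(2015, 6, 19)]))
# [(datetime.date(2015, 6, 1), datetime.date(2015, 6, 11)), (datetime.date(2015, 6, 19), datetime.date(2015, 6, 29))]
```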