Split a dataset into groups based on date + count [MSSQL 2012]

Time: 2016-11-02 16:27:44

Tags: sql-server sql-server-2012 grouping

I have a dataset in SQL Server 2012 that needs to follow these rules:

  

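A "seat" allows 8 results within 6 months of the first result. If there are more than 8 results within 6 months of the date of the first result, a new "seat" is given.

If a result is created more than 6 months after the date of the first result, a new "seat" is given.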

So I have the following data:

User    DateCreated
----    -------------
User1   2015-01-01 16:05:00
User1   2015-01-02 16:05:00
User1   2015-01-03 16:05:00
User1   2015-01-04 16:05:00
User1   2015-01-05 16:05:00
User1   2015-01-06 16:05:00
User1   2015-01-07 16:05:00
User1   2015-01-08 16:05:00
User1   2015-01-09 16:05:00
User1   2015-01-10 13:25:00
User1   2015-01-11 13:25:00
User1   2015-01-12 13:25:00
User1   2015-09-01 13:00:00 
User1   2016-04-01 13:00:00
User2   2015-01-01 13:25:00 
User2   2015-01-02 13:25:00 
User2   2015-09-01 13:25:00 
User2   2016-01-01 13:25:00 
User2   2016-05-01 13:25:00 
User3   2015-01-01 16:05:00 
User3   2015-01-02 16:05:00     
User3   2015-01-03 16:05:00     

Based on the rules above, they would be split into the following "groups":

User    DateCreated             Group
----    -------------           -----
User1   2015-01-01 16:05:00     1
User1   2015-01-02 16:05:00     1
User1   2015-01-03 16:05:00     1
User1   2015-01-04 16:05:00     1
User1   2015-01-05 16:05:00     1
User1   2015-01-06 16:05:00     1
User1   2015-01-07 16:05:00     1
User1   2015-01-08 16:05:00     1 /*8 results within 6 months from first row*/

User1   2015-01-09 16:05:00     2
User1   2015-01-10 13:25:00     2
User1   2015-01-11 13:25:00     2
User1   2015-01-12 13:25:00     2

User1   2015-09-01 13:00:00     3 /*created 6+ months after previous row*/

User1   2016-04-01 13:00:00     4 /*created 6+ months after previous row*/
----
User2   2015-01-01 13:25:00     1
User2   2015-01-02 13:25:00     1

User2   2015-09-01 13:25:00     2 /*created 6+ months after previous row*/
User2   2016-01-01 13:25:00     2 

User2   2016-05-01 13:25:00     3 /*created 6+ months after previous row*/
----
User3   2015-01-01 16:05:00     1
User3   2015-01-02 16:05:00     1
User3   2015-01-03 16:05:00     1 /*3 results within 6 months from first row, within the 8 result cut-off */

which could then end up as

User        Seats
----        -----
User1       4
User2       3
User3       1

How, if at all, can this be achieved in a SQL query?

-

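OK, so the comment on https://stackoverflow.com/a/40385655/1283391 is right. I have amended the expected output above to explain the differences, as my interpretation was incorrect.

I don't think I need a SUM; it is more of a running total.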

1 answer:

Answer 0: (score: 1)

Here is one way to do it:

;WITH cte
     AS (SELECT *,
                -- Bucket each user's rows into blocks of 8 in date order (integer division)
                ( ( Row_number() OVER(PARTITION BY [User] ORDER BY [DateCreated]) - 1 ) / 8 ) + 1 AS rn,
                -- Previous row's date for the same user (NULL on the user's first row)
                Lag([DateCreated]) OVER(PARTITION BY [User] ORDER BY [DateCreated]) AS PREV_DATE
         FROM   Yourtable),
     INTR
     AS (SELECT *,
                -- Running total of month boundaries crossed between consecutive rows
                Sum(Datediff(mm, Isnull(PREV_DATE, [DateCreated]), [DateCreated]))
                  OVER(PARTITION BY [User] ORDER BY [DateCreated]) AS GRP
         FROM   cte)
-- A new group starts whenever rn or GRP changes
SELECT [User],
       [DateCreated],
       Dense_rank() OVER(PARTITION BY [User] ORDER BY rn, GRP) AS Groups
FROM   INTR

Sample data

CREATE TABLE Yourtable
    ([User] varchar(5), [DateCreated] datetime)
;

INSERT INTO Yourtable
    ([User], [DateCreated])
VALUES
    ('User1', '2015-01-01 16:05:00'),
    ('User1', '2015-01-02 16:05:00'),
    ('User1', '2015-01-03 16:05:00'),
    ('User1', '2015-01-04 16:05:00'),
    ('User1', '2015-01-05 16:05:00'),
    ('User1', '2015-01-06 16:05:00'),
    ('User1', '2015-01-07 16:05:00'),
    ('User1', '2015-01-08 16:05:00'),
    ('User1', '2015-01-09 16:05:00'),
    ('User1', '2015-01-10 13:25:00'),
    ('User1', '2015-01-11 13:25:00'),
    ('User1', '2015-01-12 13:25:00'),
    ('User2', '2015-01-01 13:25:00'),
    ('User2', '2015-01-02 13:25:00'),
    ('User2', '2015-09-01 13:25:00'),
    ('User2', '2016-05-01 13:25:00')
;
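
The rows above leave out a few that appear in the question: User1 on 2015-09-01 and 2016-04-01, User2 on 2016-01-01, and all of User3. To compare a query against the question's full expected output, they could be added as sketched here. The result table that follows reflects only the 16 rows above.

```sql
-- Remaining rows from the question's data set; not part of the answer's original sample
INSERT INTO Yourtable
    ([User], [DateCreated])
VALUES
    ('User1', '2015-09-01 13:00:00'),
    ('User1', '2016-04-01 13:00:00'),
    ('User2', '2016-01-01 13:25:00'),
    ('User3', '2015-01-01 16:05:00'),
    ('User3', '2015-01-02 16:05:00'),
    ('User3', '2015-01-03 16:05:00')
;
```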

Result

+-------+-------------------------+--------+
| User  |       DateCreated       | Groups |
+-------+-------------------------+--------+
| User1 | 2015-01-01 16:05:00.000 |      1 |
| User1 | 2015-01-02 16:05:00.000 |      1 |
| User1 | 2015-01-03 16:05:00.000 |      1 |
| User1 | 2015-01-04 16:05:00.000 |      1 |
| User1 | 2015-01-05 16:05:00.000 |      1 |
| User1 | 2015-01-06 16:05:00.000 |      1 |
| User1 | 2015-01-07 16:05:00.000 |      1 |
| User1 | 2015-01-08 16:05:00.000 |      1 |
| User1 | 2015-01-09 16:05:00.000 |      2 |
| User1 | 2015-01-10 13:25:00.000 |      2 |
| User1 | 2015-01-11 13:25:00.000 |      2 |
| User1 | 2015-01-12 13:25:00.000 |      2 |
| User2 | 2015-01-01 13:25:00.000 |      1 |
| User2 | 2015-01-02 13:25:00.000 |      1 |
| User2 | 2015-09-01 13:25:00.000 |      2 |
| User2 | 2016-05-01 13:25:00.000 |      3 |
+-------+-------------------------+--------+
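
The query above starts a new group whenever the running DATEDIFF(mm, ...) total changes, that is, whenever two consecutive rows fall in different calendar months. It also cuts blocks of 8 by row number over all of a user's rows. On the sample rows this gives the groups shown, but with the question's extra User2 row on 2016-01-01 it would give User2 four groups rather than the expected three. As a hedged alternative, here is a minimal sketch that walks each user's rows in date order with a recursive CTE and anchors the 6-month window on the first row of the current seat. The names numbered, walk, SeatStart, InSeat and Seat are made up for this illustration. Treating a row exactly 6 months after the seat's first row as a new seat is an assumption the question does not settle.

```sql
;WITH numbered
     AS (SELECT [User],
                [DateCreated],
                ROW_NUMBER() OVER(PARTITION BY [User] ORDER BY [DateCreated]) AS rn
         FROM   Yourtable),
     walk
     AS (-- Anchor member: each user's first row opens seat 1
         SELECT [User],
                [DateCreated],
                rn,
                [DateCreated] AS SeatStart, -- first row of the current seat
                1             AS InSeat,    -- rows counted in the current seat so far
                1             AS Seat
         FROM   numbered
         WHERE  rn = 1
         UNION ALL
         -- Recursive member: take the user's next row and decide whether it opens a new seat
         SELECT n.[User],
                n.[DateCreated],
                n.rn,
                CASE WHEN w.InSeat = 8 OR n.[DateCreated] >= DATEADD(MONTH, 6, w.SeatStart)
                     THEN n.[DateCreated] ELSE w.SeatStart END,
                CASE WHEN w.InSeat = 8 OR n.[DateCreated] >= DATEADD(MONTH, 6, w.SeatStart)
                     THEN 1 ELSE w.InSeat + 1 END,
                CASE WHEN w.InSeat = 8 OR n.[DateCreated] >= DATEADD(MONTH, 6, w.SeatStart)
                     THEN w.Seat + 1 ELSE w.Seat END
         FROM   walk AS w
                JOIN numbered AS n
                  ON n.[User] = w.[User]
                 AND n.rn = w.rn + 1)
SELECT [User],
       [DateCreated],
       Seat AS Groups
FROM   walk
ORDER  BY [User], [DateCreated]
OPTION (MAXRECURSION 0); -- assumption: users may have more rows than the default limit of 100 levels
```

Traced by hand rather than taken from a recorded run, this should return the same groups as the table above for the 16 sample rows, and 4, 3 and 1 seats for User1, User2 and User3 on the question's full data. A recursive walk processes one row per level, so for large tables a loop or cursor may be worth comparing.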

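To find the final result

-- INTR has no Groups column, and a CTE is only visible to the statement that defines it,
-- so wrap the grouping SELECT above as one more CTE (called grouped here) in the same WITH list.
SELECT [User], Max(Groups) AS Seats
FROM   grouped
GROUP  BY [User]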
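
Put together as a single statement, the seat count could look like the sketch below. It assumes the answer's grouping SELECT is wrapped as a third CTE; the name grouped is not from the original answer and is used only for illustration.

```sql
;WITH cte
     AS (SELECT *,
                ( ( Row_number() OVER(PARTITION BY [User] ORDER BY [DateCreated]) - 1 ) / 8 ) + 1 AS rn,
                Lag([DateCreated]) OVER(PARTITION BY [User] ORDER BY [DateCreated]) AS PREV_DATE
         FROM   Yourtable),
     INTR
     AS (SELECT *,
                Sum(Datediff(mm, Isnull(PREV_DATE, [DateCreated]), [DateCreated]))
                  OVER(PARTITION BY [User] ORDER BY [DateCreated]) AS GRP
         FROM   cte),
     grouped -- hypothetical name: the grouping SELECT from the answer, wrapped as a CTE
     AS (SELECT [User],
                Dense_rank() OVER(PARTITION BY [User] ORDER BY rn, GRP) AS Groups
         FROM   INTR)
SELECT [User],
       Max(Groups) AS Seats -- highest group number per user = number of seats
FROM   grouped
GROUP  BY [User];
```

On the 16 sample rows this should give 2 seats for User1 and 3 for User2, matching the highest group numbers in the result table.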