Accumulating over dates while discarding some values

Time: 2017-10-17 23:03:31

Tags: sql sql-server sql-server-2012

I have an Answers table with the following schema:

CREATE TABLE Answers
  ([id] int, [analyst_id] int, [date] date);

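I have to "accumulate" how many answers an analyst has in each month, discarding any answer that falls less than 3 months after the last counted answer. Given the following: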

INSERT INTO Answers
  ([id], [analyst_id], [date])
VALUES
  (1, 1, '2017/01/01'),
  (2, 1, '2017/02/01'), -- should be discarded
  (3, 1, '2017/03/01'), -- should be discarded
  (4, 1, '2017/05/01'),
  (5, 1, '2017/06/01'), -- should be discarded
  (6, 1, '2017/07/01'), -- should be discarded
  (7, 1, '2017/08/01'),
  (8, 2, '2017/01/01'),
  (9, 2, '2017/04/01'),
  (10, 1, '2018/02/01'),
  (11, 2, '2018/03/01');

The expected result is:

analyst_id | month-year | count
-------------------------------
1          | 01/2017    | 1
1          | 02/2017    | 1
1          | 03/2017    | 1
1          | 04/2017    | 1
1          | 05/2017    | 2
1          | 06/2017    | 2
1          | 07/2017    | 2
1          | 08/2017    | 3
1          | 09/2017    | 3
1          | 10/2017    | 3
1          | 11/2017    | 3
1          | 12/2017    | 3
2          | 01/2017    | 1
2          | 02/2017    | 1
2          | 03/2017    | 1
2          | 04/2017    | 2
2          | 05/2017    | 2
2          | 06/2017    | 2
2          | 07/2017    | 2
2          | 08/2017    | 2
2          | 09/2017    | 2
2          | 10/2017    | 2
2          | 11/2017    | 2
2          | 12/2017    | 2
1          | 01/2018    | 0
1          | 02/2018    | 1
1          | 03/2018    | 1
2          | 01/2018    | 0
2          | 02/2018    | 0
2          | 03/2018    | 1

The DBMS is SQL Server 2012.

Edit

I wrote this fiddle with my current partial solution: http://sqlfiddle.com/#!6/c2e82e/5

The count needs to reset every year.

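1 answer:

Answer 0 (score: 2)

Edit

OK, for the updated question you basically need to build a "dates" table (here a CTE named "D") that contains one date per month, for each analyst, between the minimum and maximum dates in the Answers table. You can then join the filtered answers to it and use the DENSE_RANK() window function to determine the count.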
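
As an aside, the month list at the heart of "D" can be sketched in isolation. The snippet below is only an illustration, not part of the answer's query: it swaps in a recursive CTE instead of row numbers, and the names @FirstMonth, @LastMonth and Months are hypothetical, with the bounds hard-coded to match the sample data.

```sql
-- Hypothetical bounds; in the real query they would come from
-- MIN([Date]) and MAX([Date]) of the Answers table.
DECLARE @FirstMonth DATE = '20170101';
DECLARE @LastMonth  DATE = '20180301';

WITH Months AS
(
    -- Anchor: the first month in the range.
    SELECT [Date] = @FirstMonth
    UNION ALL
    -- Recursive step: add one month until the upper bound is reached.
    SELECT DATEADD(MONTH, 1, [Date])
    FROM Months
    WHERE [Date] < @LastMonth
)
SELECT [Month-Year] = FORMAT([Date], 'MM/yyyy')
FROM Months
ORDER BY [Date]
OPTION (MAXRECURSION 0); -- lift the default limit of 100 recursion levels
```

With these bounds it should return one row per month from January 2017 through March 2018. The query below builds the same kind of list from row numbers over sys.objects, which works as long as sys.objects has at least as many rows as there are months in the range.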

DECLARE @Answers TABLE (ID INT, Analyst_ID INT, [Date] DATE);
INSERT @Answers (ID, Analyst_ID, [Date])
VALUES
  (1, 1, '2017/01/01'),
  (2, 1, '2017/02/01'),
  (3, 1, '2017/03/01'),
  (4, 1, '2017/05/01'),
  (5, 1, '2017/06/01'),
  (6, 1, '2017/07/01'),
  (7, 1, '2017/08/01'),
  (8, 2, '2017/01/01'),
  (9, 2, '2017/04/01'),
  (10, 1, '2018/02/01'),
  (11, 2, '2018/03/01');

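WITH CTE AS -- Answers that survive the 3-month rule, found one at a time per analyst.
(
    -- Anchor: each analyst's earliest answer is always kept.
    SELECT A.Analyst_ID, [Date] = MIN(A.[Date])
    FROM @Answers AS A
    GROUP BY A.Analyst_ID
    UNION ALL
    -- Recursive step: from the last kept answer, keep the next answer
    -- that is dated at least 3 months later.
    SELECT A.Analyst_ID, A.[Date]
    FROM 
    (
        -- RN = 1 marks the first qualifying answer. Ordering by ID assumes
        -- that IDs increase with the answer date.
        SELECT A.Analyst_ID, A.[Date], RN = ROW_NUMBER() OVER (PARTITION BY A.Analyst_ID ORDER BY A.ID)
        FROM @Answers AS A
        JOIN CTE 
            ON CTE.Analyst_ID = A.Analyst_ID
            AND DATEADD(MONTH, 3, CTE.[Date]) <= A.[Date]
    ) AS A
    WHERE A.RN = 1
),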

D AS -- List of dates between minimum and maximum date in table for each analyst ID.
(
    SELECT [Date] = DATEADD(MONTH, RN, (SELECT MIN([Date]) FROM @Answers)),
           A.Analyst_ID
    FROM (SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 FROM sys.objects) AS O
    CROSS JOIN (SELECT DISTINCT Analyst_ID FROM @Answers) AS A
    WHERE RN <= (SELECT DATEDIFF(MONTH, MIN([Date]), MAX([Date])) FROM @Answers)
)

SELECT D.Analyst_ID,
       [Month-Year] = FORMAT(D.[Date], 'MM/yyyy'),
       [Count] = CASE
                     WHEN A.[Date] IS NULL THEN 0
                     ELSE DENSE_RANK() OVER (PARTITION BY D.Analyst_ID, DATEPART(YEAR, A.[Date])
                                             ORDER BY A.[Date])
                 END
FROM D
OUTER APPLY
(
    SELECT TOP 1 *
    FROM CTE
    WHERE CTE.Analyst_ID = D.Analyst_ID
      AND CTE.[Date] <= D.[Date]
      AND DATEDIFF(YEAR, CTE.[Date], D.[Date]) = 0
    ORDER BY CTE.[Date] DESC
) AS A
ORDER BY D.Analyst_ID, D.[Date];
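
To check which answers survive the 3-month rule before any counting happens, the filtering step can be run on its own. The following is a minimal sketch under a few assumptions: the CTE is renamed to the hypothetical name Kept, rows are ordered by [Date] rather than by ID, and the date literals use the unseparated yyyymmdd form so that they do not depend on session settings.

```sql
-- Same sample data as in the question.
DECLARE @Answers TABLE (ID INT, Analyst_ID INT, [Date] DATE);
INSERT @Answers (ID, Analyst_ID, [Date])
VALUES
  (1, 1, '20170101'), (2, 1, '20170201'), (3, 1, '20170301'),
  (4, 1, '20170501'), (5, 1, '20170601'), (6, 1, '20170701'),
  (7, 1, '20170801'), (8, 2, '20170101'), (9, 2, '20170401'),
  (10, 1, '20180201'), (11, 2, '20180301');

WITH Kept AS
(
    -- Anchor: the earliest answer of each analyst is always kept.
    SELECT A.Analyst_ID, [Date] = MIN(A.[Date])
    FROM @Answers AS A
    GROUP BY A.Analyst_ID
    UNION ALL
    -- Recursive step: keep the earliest answer dated at least
    -- 3 months after the previously kept one.
    SELECT N.Analyst_ID, N.[Date]
    FROM
    (
        SELECT A.Analyst_ID, A.[Date],
               RN = ROW_NUMBER() OVER (PARTITION BY A.Analyst_ID ORDER BY A.[Date])
        FROM @Answers AS A
        JOIN Kept AS K
            ON K.Analyst_ID = A.Analyst_ID
            AND A.[Date] >= DATEADD(MONTH, 3, K.[Date])
    ) AS N
    WHERE N.RN = 1
)
SELECT Analyst_ID, [Date]
FROM Kept
ORDER BY Analyst_ID, [Date];
```

For the sample data this should keep the answers dated 2017-01-01, 2017-05-01, 2017-08-01 and 2018-02-01 for analyst 1, and 2017-01-01, 2017-04-01 and 2018-03-01 for analyst 2. Note that the 3-month rule itself runs across the year boundary; only the count in the final query is reset per year.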