Create a new incremental grouping column based on logic in the group, ranking, and category columns

Time: 2018-01-08 19:02:27

Tags: sql-server sql-server-2008

I'm trying to add totals together in a way that goes beyond a basic GROUP BY or CASE statement.

Here is a sample data set:

Amt Cust_id Ranking PlanType
10  1       1       Term
6   1       2       Variable
8   1       3       Variable
7   1       4       Variable
12  1       5       Term
6   1       6       Variable
10  1       7       Variable

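The goal is to return the largest sum of Amt over rows whose PlanType is 'Variable' and whose Ranking numbers are next to each other.

So the answer for the example is the sum of rows 2-4, which is 21 (6 + 8 + 7). The answer is not the sum of all 'Variable' rows, because row 5 is a 'Term' row that separates them.

So I'd like to end up with a data set like the following, handling multiple customers: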

Amt Cust_ID
21  1
30  2 
45  3

This is where I'm stuck. It returns the wrong answer:

CREATE TABLE #tb (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10))
INSERT INTO #tb
VALUES (10,1,1,'Term'),
    (6,1,2,'Variable'),
    (8,1,3,'Variable'),
    (7,1,4,'Variable'),
    (12,1,5,'Term'),
    (6,1,6,'Variable'),
    (10,1,7,'Variable'),

    (10,2,1,'Term'),
    (6,2,2,'Variable'),
    (7,2,4,'Variable'),
    (12,2,5,'Term'),
    (6,2,6,'Variable'),
    (50,2,7,'Variable')

select  
    ( SELECT SUM(Amt) FROM #tb as t2
      WHERE t2.Cust_ID=t1.Cust_ID AND t2.Ranking<=t1.Ranking AND 
      t2.PlanType='Variable') RollingAmt

,Cust_ID, Ranking, Amt, PlanType
from #tb as t1
order by Cust_ID, Ranking

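The query computes a rolling sum, ordered by Ranking, over rows where PlanType = 'Variable'. Unfortunately, it accumulates all of the 'Variable' rows for a customer into one running total. I need it not to do that: whenever it encounters a 'Term' row, the sum needs to start over within that customer.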

1 Answer:

Answer 0: (score: 0)

To do this, you need to use the gaps-and-islands technique to generate a "group ID" for each consecutive run of the same PlanType. You can then sum by that new group ID.

Try this:
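As a rough sketch of why the row-number difference works, the query below lists both row numbers and their difference for customer 1 only. The table variable @demo and the column aliases rn_type and rn_all are names made up for this illustration. Within a run of the same PlanType both counters advance together, so the difference stays constant; when the other PlanType interrupts the run, only rn_all advances and the difference changes.

```sql
-- Hypothetical table variable holding only customer 1 from the question's sample.
DECLARE @demo TABLE (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10));
INSERT INTO @demo
VALUES (10,1,1,'Term'),
       (6,1,2,'Variable'),
       (8,1,3,'Variable'),
       (7,1,4,'Variable'),
       (12,1,5,'Term'),
       (6,1,6,'Variable'),
       (10,1,7,'Variable');

SELECT Ranking, PlanType, Amt,
       -- position of the row among rows of the same PlanType
       ROW_NUMBER() OVER(PARTITION BY Cust_id, PlanType ORDER BY Ranking) AS rn_type,
       -- position of the row among all rows for the customer
       ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY Ranking) AS rn_all,
       -- constant within one consecutive run of a PlanType
       ROW_NUMBER() OVER(PARTITION BY Cust_id, PlanType ORDER BY Ranking)
        - ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY Ranking) AS groupID
FROM @demo
ORDER BY Ranking;

-- Ranking  PlanType  Amt  rn_type  rn_all  groupID
-- 1        Term      10   1        1        0
-- 2        Variable  6    1        2       -1
-- 3        Variable  8    2        3       -1
-- 4        Variable  7    3        4       -1
-- 5        Term      12   2        5       -3
-- 6        Variable  6    4        6       -2
-- 7        Variable  10   5        7       -2
```

The 'Variable' rows fall into two groups, -1 (rows 2-4, total 21) and -2 (rows 6-7, total 16). The groupID values are only meaningful together with PlanType, and they are labels rather than sequence numbers. Because the grouping relies on row positions rather than on the Ranking values themselves, a missing Ranking value (customer 2 has no Ranking 3) does not by itself start a new group.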

DECLARE @data TABLE (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10))
INSERT INTO @data
VALUES (10,1,1,'Term'),
        (6,1,2,'Variable'),
        (8,1,3,'Variable'),
        (7,1,4,'Variable'),
        (12,1,5,'Term'),
        (6,1,6,'Variable'),
        (10,1,7,'Variable'),

        (10,2,1,'Term'),
        (6,2,2,'Variable'),
        (7,2,4,'Variable'),
        (12,2,5,'Term'),
        (6,2,6,'Variable'),
        (50,2,7,'Variable')

;WITH X AS
(
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY Cust_id, PlanType ORDER BY Ranking)
            - ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY Ranking) AS groupID /* Assign a groupID to consecutive runs of PlanTypes by Cust_id */
    FROM @data
), Y AS
(
    SELECT *, SUM(Amt) OVER(PARTITION BY Cust_id, groupID) AS AmtSum /* Sum Amt by Cust/groupID */
    FROM X
    WHERE PlanType = 'Variable'
), Z AS
(
    SELECT *, ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY AmtSum DESC) AS RN /* Assign a row number (1) to highest AmtSum by Cust */
    FROM Y
)

SELECT AmtSum, Cust_id
FROM Z
WHERE RN = 1 /* Only select RN=1 to get highest value by cust_id/groupId */

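If you're curious how this all works, comment out the final SELECT and run SELECT * FROM X in its place, then SELECT * FROM Y, and so on, to see what each step does. Keep in mind that only one SELECT can follow the whole CTE structure.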
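For instance, inspecting the Y step might look like the sketch below. It is cut down to customer 1 to keep the output short; the table variable name @demo is made up for this illustration, and the Z step is dropped because nothing references it here.

```sql
-- Hypothetical table variable holding only customer 1 from the sample data.
DECLARE @demo TABLE (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10));
INSERT INTO @demo
VALUES (10,1,1,'Term'),
       (6,1,2,'Variable'),
       (8,1,3,'Variable'),
       (7,1,4,'Variable'),
       (12,1,5,'Term'),
       (6,1,6,'Variable'),
       (10,1,7,'Variable');

WITH X AS
(
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY Cust_id, PlanType ORDER BY Ranking)
            - ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY Ranking) AS groupID
    FROM @demo
), Y AS
(
    SELECT *, SUM(Amt) OVER(PARTITION BY Cust_id, groupID) AS AmtSum
    FROM X
    WHERE PlanType = 'Variable'
)
-- The original final SELECT is replaced here to look at an intermediate step.
-- Change "FROM Y" to "FROM X" (and drop AmtSum) to look one step earlier.
SELECT Ranking, Amt, groupID, AmtSum
FROM Y
ORDER BY Ranking;

-- Ranking  Amt  groupID  AmtSum
-- 2        6    -1       21
-- 3        8    -1       21
-- 4        7    -1       21
-- 6        6    -2       16
-- 7        10   -2       16
```

Every row carries the total of its own group, which is why the last step only has to rank rows by AmtSum within each customer and keep the first one.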