Grouping customers by status in T-SQL

Date: 2019-01-27 11:25:07

Tags: sql sql-server

I have a table like this:

customer_id  mis_date    status
-------------------------------
10003        2014-01-01  1
10003        2014-01-02  1
10003        2014-01-03  0
10003        2014-01-04  0
10003        2014-01-05  0
10003        2014-01-06  1
10003        2014-01-07  1
10003        2014-01-08  1
10003        2014-01-09  1
10003        2014-01-10  0
10003        2014-01-11  0
10003        2014-01-12  0
10003        2014-01-13  1
10003        2014-01-14  1
10003        2014-01-15  1

I am trying to build a "group" column:

customer_id  mis_date    status  group
--------------------------------------
10003        2014-01-01  1       1
10003        2014-01-02  1       1
10003        2014-01-03  0       NULL
10003        2014-01-04  0       NULL
10003        2014-01-05  0       NULL
10003        2014-01-06  1       2
10003        2014-01-07  1       2
10003        2014-01-08  1       2
10003        2014-01-09  1       2
10003        2014-01-10  0       NULL
10003        2014-01-11  0       NULL
10003        2014-01-12  0       NULL
10003        2014-01-13  1       3
10003        2014-01-14  1       3
10003        2014-01-15  1       3

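Does anyone know how I can build this group column?

Logic: I track each customer's status every day, and for each day I want to know which occurrence of the status this is in the customer's history, but only on days when the customer is actually in the status (status = 1).

For example: first_time - 1, second_time - 2, and so on.

I have been banging my head against this and can't find a solution. I suspect it isn't that complicated.

Thanks!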

5 answers:

Answer 0: (score: 3)

Something like this should work:

;WITH CTE AS (
   -- The difference of the two row numbers is constant within a run of equal status
   SELECT customer_id, mis_date, status,
          ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY mis_date) - 
          ROW_NUMBER() OVER (PARTITION BY customer_id, status ORDER BY mis_date) AS grp
   FROM mytable
), CTE2 AS (
   -- Number the status = 1 islands of each customer by their first date
   SELECT customer_id, status, grp, 
          ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY MIN(mis_date)) AS rn
   FROM CTE
   WHERE status = 1
   GROUP BY customer_id, status, grp 
)
SELECT c.customer_id, c.mis_date, c.status, c2.rn       
FROM CTE c
LEFT JOIN CTE2 c2 
   ON c.customer_id = c2.customer_id AND c.status = c2.status AND c.grp = c2.grp
ORDER BY c.customer_id, c.mis_date

CTE identifies islands of consecutive records that share the same status value. CTE2 enumerates the status = 1 islands.
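
To see why the row-number difference marks the islands, here is a rough Python trace of the same idea on the sample statuses. It is only a sketch, not part of the query: the function names are made up for illustration, and the rows are assumed to belong to one customer and to be sorted by mis_date already.

```python
# Sketch only: mimics the ROW_NUMBER() difference on one customer's rows,
# assumed to be already sorted by mis_date. All names are illustrative.
statuses = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1]

def island_keys(statuses):
    """Return (status, rn_all - rn_within_status) for each row."""
    seen = {}  # status -> number of rows seen so far with that status
    keys = []
    for rn_all, status in enumerate(statuses, start=1):
        seen[status] = seen.get(status, 0) + 1
        keys.append((status, rn_all - seen[status]))
    return keys

def number_islands(statuses):
    """Number the status = 1 islands in order of first appearance."""
    numbering = {}  # island key -> sequence number
    result = []
    for status, diff in island_keys(statuses):
        if status != 1:
            result.append(None)  # status = 0 rows get NULL
            continue
        if (status, diff) not in numbering:
            numbering[(status, diff)] = len(numbering) + 1
        result.append(numbering[(status, diff)])
    return result

print(island_keys(statuses))
print(number_islands(statuses))
```

On this data the second run of zeros and the third run of ones both end up with a difference of 6, which is why the grouping and the join use status together with grp rather than grp alone.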

Answer 1: (score: 3)

Another approach, without a CTE, is something like the following query.

SELECT customer_id, mis_date, status, 
       CASE WHEN status = 0 THEN NULL 
            ELSE DENSE_RANK() OVER (PARTITION BY customer_id, status ORDER BY rc) END AS grp 
FROM   (SELECT t1.*, 
               -- rc: number of status = 0 rows for this customer up to this date
               (SELECT COUNT(*) FROM table1 t2 
                WHERE  t2.customer_id = t1.customer_id 
                  AND  t2.mis_date <= t1.mis_date AND t2.status = 0) AS rc 
        FROM   table1 t1) t 
ORDER  BY customer_id, mis_date 

Output:

+-------------+-------------------------+--------+------+
| customer_id | mis_date                | status | grp  |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-01 00:00:00.000 | 1      | 1    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-02 00:00:00.000 | 1      | 1    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-03 00:00:00.000 | 0      | NULL |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-04 00:00:00.000 | 0      | NULL |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-05 00:00:00.000 | 0      | NULL |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-06 00:00:00.000 | 1      | 2    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-07 00:00:00.000 | 1      | 2    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-08 00:00:00.000 | 1      | 2    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-09 00:00:00.000 | 1      | 2    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-10 00:00:00.000 | 0      | NULL |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-11 00:00:00.000 | 0      | NULL |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-12 00:00:00.000 | 0      | NULL |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-13 00:00:00.000 | 1      | 3    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-14 00:00:00.000 | 1      | 3    |
+-------------+-------------------------+--------+------+
| 10003       | 2014-01-15 00:00:00.000 | 1      | 3    |
+-------------+-------------------------+--------+------+


Answer 2: (score: 2)

Please check this solution. It adds the grouping you need.

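 with cte0 as 
 ( 
   -- oldstatus: the previous day's status (the first row falls back to its own status)
   select [customer_id], [mis_date], [status],
     COALESCE(LAG(status) over (partition by customer_id order by mis_date), status) oldstatus
   from Table1
 ),
 cte1 as ( 
   select cte0.*, 
     case when status = 0 then 
       null 
     else
       -- running count of 1 -> 0 transitions seen so far
       COUNT( case when  status != oldstatus and status = 0 then 1  else null end) OVER (partition by customer_id ORDER BY mis_date) 
     end + 1 grp
   from cte0
 )
 select * from cte1
 GO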

 customer_id | mis_date            | status | oldstatus |  grp
 ----------: | :------------------ | -----: | --------: | ---:
       10003 | 01/01/2014 00:00:00 |      1 |         1 |    1
       10003 | 02/01/2014 00:00:00 |      1 |         1 |    1
       10003 | 03/01/2014 00:00:00 |      0 |         1 | null
       10003 | 04/01/2014 00:00:00 |      0 |         0 | null
       10003 | 05/01/2014 00:00:00 |      0 |         0 | null
       10003 | 06/01/2014 00:00:00 |      1 |         0 |    2
       10003 | 07/01/2014 00:00:00 |      1 |         1 |    2
       10003 | 08/01/2014 00:00:00 |      1 |         1 |    2
       10003 | 09/01/2014 00:00:00 |      1 |         1 |    2
       10003 | 10/01/2014 00:00:00 |      0 |         1 | null
       10003 | 11/01/2014 00:00:00 |      0 |         0 | null
       10003 | 12/01/2014 00:00:00 |      0 |         0 | null
       10003 | 13/01/2014 00:00:00 |      1 |         0 |    3
       10003 | 14/01/2014 00:00:00 |      1 |         1 |    3
       10003 | 15/01/2014 00:00:00 |      1 |         1 |    3
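
The counting logic of this query can be traced with a small Python sketch. This is only an illustration under assumptions: the function name is invented, and the input is taken to be one customer's statuses already sorted by mis_date.

```python
# Sketch only: mirrors LAG(status) plus a running count of 1 -> 0 transitions.
# Input is one customer's statuses in mis_date order; names are illustrative.
def groups_by_transitions(statuses):
    result = []
    drops = 0  # number of 1 -> 0 transitions seen so far
    prev = statuses[0] if statuses else None  # COALESCE(LAG(status), status)
    for status in statuses:
        if status == 0 and status != prev:
            drops += 1  # the current row starts a new run of zeros
        # status = 1 rows get (transitions so far) + 1, status = 0 rows get NULL
        result.append(drops + 1 if status == 1 else None)
        prev = status
    return result

sample = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1]
print(groups_by_transitions(sample))
```

Because only 1 -> 0 transitions are counted, a history that starts with status = 0 still gives its first run of ones the number 1.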
 


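Answer 3: (score: 1)

You can identify each group of "1"s by the number of zero-status rows that precede it. If you don't care whether the group numbers are consecutive: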

select t.*,
       (case when status = 1
             then sum(case when status = 0 then 1 else 0 end) over (partition by customer_id order by mis_date)
        end) as grp
from t;

This needs no subquery, join, or GROUP BY.

However, you probably want the numbers to be consecutive (as in your example). That requires a subquery:

select t.*,
       (case when status = 1
             -- partition by status too, so grp1 values of status = 0 rows do not inflate the rank
             then dense_rank() over (partition by customer_id, status order by grp1)
        end) as grp
from (select t.*,
             sum(case when status = 0 then 1 else 0 end) over (partition by customer_id order by  mis_date) as grp1
      from t
     ) t
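
As a quick check of the two steps (running count of zeros, then a dense rank), here is a rough Python sketch on the sample statuses. The function names are invented for illustration, the input is assumed to be one customer's rows in mis_date order, and the rank is taken over the status = 1 rows only.

```python
# Sketch only: running count of status = 0 rows, then a dense rank of that
# count among the status = 1 rows. Names are illustrative, not from the query.
def running_zero_count(statuses):
    counts = []
    zeros = 0
    for status in statuses:
        if status == 0:
            zeros += 1
        counts.append(zeros)  # zeros seen up to and including this row
    return counts

def dense_rank_groups(statuses):
    counts = running_zero_count(statuses)
    # distinct counts that occur on status = 1 rows, in ascending order
    distinct = sorted({c for s, c in zip(statuses, counts) if s == 1})
    rank = {c: i for i, c in enumerate(distinct, start=1)}
    return [rank[c] if s == 1 else None for s, c in zip(statuses, counts)]

sample = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1]
print(running_zero_count(sample))
print(dense_rank_groups(sample))
```

The raw counts on the status = 1 rows are 0, 3 and 6, so the first query labels the groups with gaps, and the dense rank turns those labels into 1, 2 and 3.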

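Answer 4: (score: 0)

You can use the ALTER TABLE statement in SQL Server to add a column to a table.

Syntax

The syntax for adding a column to a table in SQL Server (Transact-SQL) is:

ALTER TABLE table_name
  ADD column_name column_definition;

Let's look at an example that shows how to add a column to a SQL Server table with the ALTER TABLE statement.

For example:

ALTER TABLE customer
  ADD [group] VARCHAR(10);

This SQL Server ALTER TABLE example adds a column called group to the customer table. Because GROUP is a reserved keyword in T-SQL, the column name has to be delimited with square brackets.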