SQL GROUP BY with consecutive rows kept as separate groups

Time: 2016-08-20 06:22:48

Tags: sql sql-server

I want to group dates by customer and generate a count column for two separate values (Flag = Y and Flag = N). The input table looks like this:

Customer  Date    Flag
--------  ------  ----
001       201201  Y
001       201202  Y
001       201203  Y
001       201204  N
001       201205  N
001       201206  Y
001       201207  Y
001       201208  Y
001       201209  N
002       201201  N
002       201202  Y
002       201203  Y
002       201205  N

The output should look like this:

Customer  MinDate  MaxDate  Count_Y
--------  -------  -------  -------
001       201201   201203   3
001       201206   201208   3
002       201202   201203   2

How can I write this SQL query? Any help is appreciated. Thanks!

2 answers:

Answer 0: (score: 1)

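You want to find runs of consecutive "Y" values. This is a "gaps-and-islands" problem, and there are two basic approaches:

  • Identify the first "Y" in each run and use that information to define a group of consecutive "Y" values.
  • Use the difference between two row_number() values.

The first relies on SQL Server 2012+, and you haven't specified a version. So, the second looks like this: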

select customer, min(date) as mindate, max(date) as maxdate,
       count(*) as numYs
from (select t.*,
             row_number() over (partition by customer order by date) as seqnum_cd,
             row_number() over (partition by customer, flag order by date) as seqnum_cfd
      from t
     ) t
where flag = 'Y'
group by customer, (seqnum_cd - seqnum_cfd), flag;

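Explaining how this works is a bit tricky. In my experience, the easiest way is to run the subquery on its own: you will see how the seqnum columns are calculated and "get it" by looking at the results. The key observation is that within a run of consecutive rows with the same flag, both sequence numbers increase by one per row, so their difference stays constant.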
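
To make that concrete, here is a minimal sketch that runs the subquery on the sample rows from the question. The inline `t` built from a VALUES list is only a stand-in for the real table, and storing the dates as strings and the extra `grp` column are assumptions made for illustration.

```sql
-- Stand-in for the real table: the sample rows from the question
-- (column types are assumed; dates are kept as strings here).
with t as (
    select *
    from (values ('001', '201201', 'Y'), ('001', '201202', 'Y'),
                 ('001', '201203', 'Y'), ('001', '201204', 'N'),
                 ('001', '201205', 'N'), ('001', '201206', 'Y'),
                 ('001', '201207', 'Y'), ('001', '201208', 'Y'),
                 ('001', '201209', 'N'), ('002', '201201', 'N'),
                 ('002', '201202', 'Y'), ('002', '201203', 'Y'),
                 ('002', '201205', 'N')
         ) as v(customer, [date], flag)
)
select t.*,
       row_number() over (partition by customer order by [date]) as seqnum_cd,
       row_number() over (partition by customer, flag order by [date]) as seqnum_cfd,
       -- grp is the difference that the outer query groups by
       row_number() over (partition by customer order by [date])
         - row_number() over (partition by customer, flag order by [date]) as grp
from t
order by customer, [date];

-- Expected result:
-- customer  date    flag  seqnum_cd  seqnum_cfd  grp
-- 001       201201  Y     1          1           0
-- 001       201202  Y     2          2           0
-- 001       201203  Y     3          3           0
-- 001       201204  N     4          1           3
-- 001       201205  N     5          2           3
-- 001       201206  Y     6          4           2
-- 001       201207  Y     7          5           2
-- 001       201208  Y     8          6           2
-- 001       201209  N     9          3           6
-- 002       201201  N     1          1           0
-- 002       201202  Y     2          1           1
-- 002       201203  Y     3          2           1
-- 002       201205  N     4          2           2
```

Among the "Y" rows, customer 001 has grp values 0 and 2 and customer 002 has grp value 1, which corresponds to the three rows of the desired output.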

Note: This assumes there is at most one record per date. If there are more, you can use dense_rank() instead of row_number() to get the same effect.
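
For comparison, the first approach listed above could be sketched roughly as follows, assuming SQL Server 2012 or later for lag() and the running sum(). The inline `t` is again a stand-in for the real table, and the names `marked`, `grouped`, `is_start`, and `grp` are illustrative, not part of the original answer.

```sql
-- Stand-in for the real table: the sample rows from the question.
with t as (
    select *
    from (values ('001', '201201', 'Y'), ('001', '201202', 'Y'),
                 ('001', '201203', 'Y'), ('001', '201204', 'N'),
                 ('001', '201205', 'N'), ('001', '201206', 'Y'),
                 ('001', '201207', 'Y'), ('001', '201208', 'Y'),
                 ('001', '201209', 'N'), ('002', '201201', 'N'),
                 ('002', '201202', 'Y'), ('002', '201203', 'Y'),
                 ('002', '201205', 'N')
         ) as v(customer, [date], flag)
),
marked as (
    -- Flag a row as the start of a run when it is 'Y' and the previous
    -- row for the same customer is not 'Y' (or there is no previous row).
    select t.*,
           case when flag = 'Y'
                 and coalesce(lag(flag) over (partition by customer
                                              order by [date]), 'N') <> 'Y'
                then 1 else 0 end as is_start
    from t
),
grouped as (
    -- A running total of the start flags gives each run its own number.
    select m.*,
           sum(is_start) over (partition by customer order by [date]
                               rows unbounded preceding) as grp
    from marked m
)
select customer, min([date]) as mindate, max([date]) as maxdate,
       count(*) as numYs
from grouped
where flag = 'Y'
group by customer, grp
order by customer, mindate;

-- Expected result:
-- customer  mindate  maxdate  numYs
-- 001       201201   201203   3
-- 001       201206   201208   3
-- 002       201202   201203   2
```

On this sample it returns the same three rows as the row_number() version; which one reads better is largely a matter of taste.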

Answer 1: (score: 0)

Try the following query; it should give you what you want.

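-- Drop the demo table only if it already exists, so the script can be rerun
IF OBJECT_ID('dbo.GroupCustomer', 'U') IS NOT NULL
    DROP TABLE [dbo].[GroupCustomer]
GO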

CREATE TABLE [dbo].[GroupCustomer](
     Customer VARCHAR(50),
     [Date] [datetime] NULL,
     Flag VARCHAR(1)
       )

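-- Note: because [Date] is a datetime, a six-digit string such as '201201'
-- is read as yyMMdd, so (with the default two-digit year cutoff) these
-- values are stored as consecutive days starting 2020-12-01, not as months.
INSERT INTO [dbo].[GroupCustomer]  (Customer ,[Date],Flag)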
VALUES   ('001','201201','Y'),('001','201202','Y'),
         ('001','201203','Y'),('001','201204','N'),
         ('001','201205','N'),('001','201206','Y'),
         ('001','201207','Y'),('001','201208','Y'),
         ('001','201209','N'),('002','201201','N'),
         ('002','201202','Y'),('002','201203','Y'),
         ('002','201205','N')
GO


;WITH cte_cnt
AS
(
 SELECT Customer,Format(MIN([Date]),'yyMMdd') AS MinDate
   ,Format(MAX([Date]),'yyMMdd') AS MaxDate
   , COUNT('A') AS Count_Y
 FROM (
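     -- Only the 'Y' rows are numbered. Within a run of consecutive days,
     -- [Date] and the row number both advance by one per row, so their
     -- difference (Diff) stays constant and identifies the run.
     SELECT Customer,Flag,[Date],
        ROW_NUMBER() OVER(Partition by customer ORDER BY [Date]) AS RowNum,
        -- The row number is implicitly converted to a datetime here
        DATEDIFF(D, ROW_NUMBER() OVER(Partition by customer ORDER BY [Date])
        , [Date]) AS Diff
    FROM [GroupCustomer]
    WHERE Flag='Y') AS dt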
   GROUP BY Customer,Flag, Diff )
SELECT *
FROM  cte_cnt  c
ORDER BY Customer

GO