I want to group dates by customer and generate count columns for two separate values (Flag = Y and Flag = N). The input table looks like this:
Customer Date Flag
------- ------- -----
001 201201 Y
001 201202 Y
001 201203 Y
001 201204 N
001 201205 N
001 201206 Y
001 201207 Y
001 201208 Y
001 201209 N
002 201201 N
002 201202 Y
002 201203 Y
002 201205 N
The output should look like this:
Customer MinDate MaxDate Count_Y
------- ------ ------- -------
001 201201 201203 3
001 201206 201208 3
002 201202 201203 2
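How can I write the SQL query? Any help is appreciated. Thanks!
Answer 0 (score: 1)
You want to find consecutive values of "Y". This is a "gaps-and-islands" problem, and there are two basic approaches. One of them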
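uses the difference of row_number() values for the calculation. The first approach depends on SQL Server 2012+, and you haven't specified a version. So, the second (the row_number() difference) looks like this: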
select customer, min(date) as mindate, max(date) as maxdate,
count(*) as numYs
from (select t.*,
row_number() over (partition by customer order by date) as seqnum_cd,
row_number() over (partition by customer, flag order by date) as seqnum_cfd
from t
) t
where flag = 'Y'
group by customer, (seqnum_cd - seqnum_cfd), flag;
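Explaining how this works is a bit tricky. In my experience, if you run the subquery, you will see how the seqnum columns are calculated and "get it" by looking at the results. The key point is that within a run of consecutive "Y" rows both row numbers increase by one per row, so their difference stays constant and identifies the run.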
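To make the "run the subquery and look" advice concrete, here is a minimal sketch that runs the same subquery on the sample data from the question. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained, and it needs an SQLite build with window functions, 3.25 or later). The grp column and the subquery_rows helper are names added for illustration.

```python
import sqlite3

# Sample rows from the question: (customer, date, flag).
ROWS = [
    ("001", 201201, "Y"), ("001", 201202, "Y"), ("001", 201203, "Y"),
    ("001", 201204, "N"), ("001", 201205, "N"), ("001", 201206, "Y"),
    ("001", 201207, "Y"), ("001", 201208, "Y"), ("001", 201209, "N"),
    ("002", 201201, "N"), ("002", 201202, "Y"), ("002", 201203, "Y"),
    ("002", 201205, "N"),
]

# The subquery from the answer, plus the difference that the outer query groups by.
SUBQUERY = """
select customer, date, flag, seqnum_cd, seqnum_cfd,
       seqnum_cd - seqnum_cfd as grp
from (select t.*,
             row_number() over (partition by customer order by date) as seqnum_cd,
             row_number() over (partition by customer, flag order by date) as seqnum_cfd
      from t
     ) t
order by customer, date
"""


def subquery_rows():
    """Load the sample data into an in-memory table and run the subquery."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (customer text, date integer, flag text)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    rows = conn.execute(SUBQUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in subquery_rows():
        print(row)
```

For customer 001, the three "Y" rows 201201 to 201203 all get grp = 0 and the three "Y" rows 201206 to 201208 all get grp = 2, which is why grouping by that difference separates the two runs.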
Note: this assumes there is at most one record per date. If there are more, you can use dense_rank() instead of row_number() to get the same effect.
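As a rough illustration of the dense_rank() variant, the sketch below runs the query on a small made-up data set in which one date appears twice. The data, the y_islands_dense helper and the extra num_dates column are hypothetical additions, and SQLite (via Python's sqlite3) again stands in for SQL Server.

```python
import sqlite3

# Hypothetical rows with a duplicated date (201201 appears twice for customer 001).
ROWS = [
    ("001", 201201, "Y"), ("001", 201201, "Y"), ("001", 201202, "Y"),
    ("001", 201203, "N"), ("001", 201204, "Y"),
]

# Same shape as the answer's query, with dense_rank() replacing row_number().
QUERY = """
select customer, min(date) as mindate, max(date) as maxdate,
       count(*) as num_rows, count(distinct date) as num_dates
from (select t.*,
             dense_rank() over (partition by customer order by date) as seqnum_cd,
             dense_rank() over (partition by customer, flag order by date) as seqnum_cfd
      from t
     ) t
where flag = 'Y'
group by customer, (seqnum_cd - seqnum_cfd), flag
order by customer, mindate
"""


def y_islands_dense():
    """Return one row per run of consecutive 'Y' dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (customer text, date integer, flag text)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in y_islands_dense():
        print(row)
```

One thing to watch: count(*) counts rows, so the duplicated date is counted twice in num_rows. If you want the number of distinct dates in each run, count(distinct date) is probably what you are after.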
Answer 1 (score: 0)
Try the following query; it should give you what you want.
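-- Drop the table only if it already exists, so the first run does not fail
IF OBJECT_ID('dbo.GroupCustomer', 'U') IS NOT NULL
    DROP TABLE [dbo].[GroupCustomer]
GO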
CREATE TABLE [dbo].[GroupCustomer](
Customer VARCHAR(50),
[Date] [datetime] NULL, -- note: a string such as '201201' is parsed as yyMMdd (a day), not as a year and month
Flag VARCHAR(1)
)
INSERT INTO [dbo].[GroupCustomer] (Customer ,[Date],Flag)
VALUES ('001','201201','Y'),('001','201202','Y'),
('001','201203','Y'),('001','201204','N'),
('001','201205','N'),('001','201206','Y'),
('001','201207','Y'),('001','201208','Y'),
('001','201209','N'),('002','201201','N'),
('002','201202','Y'),('002','201203','Y'),
('002','201205','N')
GO
;WITH cte_cnt
AS
(
SELECT Customer,Format(MIN([Date]),'yyMMdd') AS MinDate
,Format(MAX([Date]),'yyMMdd') AS MaxDate
, COUNT('A') AS Count_Y
FROM (
SELECT Customer,Flag,[Date],
ROW_NUMBER() OVER(Partition by customer ORDER BY [Date]) AS ROW_NUMBER,
-- Date minus row number (in days) stays constant within a run of consecutive dates
DATEDIFF(D, ROW_NUMBER() OVER(Partition by customer ORDER BY [Date])
, [Date]) AS Diff
FROM [GroupCustomer]
WHERE Flag='Y') AS dt
GROUP BY Customer,Flag, Diff )
SELECT *
FROM cte_cnt c
ORDER BY Customer
GO
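The query above filters to Flag = 'Y' first and then groups by "date minus row number". Here is a minimal sketch of that same idea, under the assumption that the Date values are really yyyymm months (the query above treats them as days instead). The month-index arithmetic, the grp column and the y_islands_by_month helper are illustrative additions, and SQLite (via Python's sqlite3) stands in for SQL Server.

```python
import sqlite3

# Sample rows from the question: (customer, date as yyyymm, flag).
ROWS = [
    ("001", 201201, "Y"), ("001", 201202, "Y"), ("001", 201203, "Y"),
    ("001", 201204, "N"), ("001", 201205, "N"), ("001", 201206, "Y"),
    ("001", 201207, "Y"), ("001", 201208, "Y"), ("001", 201209, "N"),
    ("002", 201201, "N"), ("002", 201202, "Y"), ("002", 201203, "Y"),
    ("002", 201205, "N"),
]

QUERY = """
select customer, min(date) as mindate, max(date) as maxdate,
       count(*) as count_y
from (select customer, date,
             -- Turn yyyymm into a running month index (year * 12 + month) so that
             -- December to January is a step of 1, then subtract the row number.
             (date / 100) * 12 + (date % 100)
               - row_number() over (partition by customer order by date) as grp
      from t
      where flag = 'Y'
     ) as dt
group by customer, grp
order by customer, mindate
"""


def y_islands_by_month():
    """Return (customer, mindate, maxdate, count_y) per run of consecutive 'Y' months."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (customer text, date integer, flag text)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in y_islands_by_month():
        print(row)
```

Unlike the row_number() difference in the other answer, this version also splits a run when a month is missing from the table altogether, because it compares against the calendar rather than against the neighbouring rows. Which behaviour you want depends on your data.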