I have a table with 3 columns: id (int), date (date), and Status (bool).
It looks like this:
id date Status
1 2012-10-18 1
1 2012-10-19 1
1 2012-10-20 0
1 2012-10-21 0
1 2012-10-22 0
1 2012-10-23 0
1 2012-10-24 1
1 2012-10-25 0
1 2012-10-26 0
1 2012-10-27 0
1 2012-10-28 1
2 2012-10-19 0
2 2012-10-20 0
2 2012-10-21 0
2 2012-10-22 1
2 2012-10-23 1
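Assume the date column is sequential, with no gaps between dates.
How can I find every run of 3 consecutive zeros (in the Status column), along with the status of the following day?
Like this:
id startDate endDate NextDayStatus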
1 2012-10-20 2012-10-22 0
1 2012-10-21 2012-10-23 1
1 2012-10-25 2012-10-27 1
2 2012-10-19 2012-10-21 1
Table creation script and sample data:
CREATE TABLE [Table1](
[ID] [smallint] NOT NULL,
[Date] [date] NOT NULL,
[Status] [bit] NULL,
CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED ( [ID] ASC, [Date] ASC ) )
INSERT INTO [Table1]([ID], [Date], [Status])
SELECT 1, '2012-10-18', 1 UNION ALL
SELECT 1, '2012-10-19', 1 UNION ALL
SELECT 1, '2012-10-20', 0 UNION ALL
SELECT 1, '2012-10-21', 0 UNION ALL
SELECT 1, '2012-10-22', 0 UNION ALL
SELECT 1, '2012-10-23', 0 UNION ALL
SELECT 1, '2012-10-24', 1 UNION ALL
SELECT 1, '2012-10-25', 0 UNION ALL
SELECT 1, '2012-10-26', 0 UNION ALL
SELECT 1, '2012-10-27', 0 UNION ALL
SELECT 1, '2012-10-28', 1 UNION ALL
SELECT 2, '2012-10-19', 0 UNION ALL
SELECT 2, '2012-10-20', 0 UNION ALL
SELECT 2, '2012-10-21', 0 UNION ALL
SELECT 2, '2012-10-22', 1 UNION ALL
SELECT 2, '2012-10-23', 1
Answer 0 (score: 4)
Edit: updated to partition by ID.
This also works if the dates are not contiguous.
SELECT T1.id, T1.[Date] AS startDate, MAX(X.[Date]) AS endDate, Y.[Status] AS NextDayStatus
FROM Table1 T1
CROSS APPLY
( SELECT TOP 3 *
FROM Table1 T2
WHERE T2.id = T1.id AND T2.[Date] >= T1.Date
ORDER BY T2.[Date]
) X
CROSS APPLY
( SELECT TOP 4 *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY T3.[Date]) AS rn
FROM Table1 T3
WHERE T3.id = T1.id AND T3.[Date] >= T1.Date
ORDER BY T3.[Date]
) Y
WHERE y.rn = 4
GROUP BY T1.id, T1.[Date], Y.[Status]
HAVING SUM(CAST(X.[Status] AS tinyint)) = 0;
For completeness, here is the more elegant SQL Server 2012 approach. It should work with any RDBMS that has proper window/analytic function support.
SELECT
X.id, X.startDate, X.endDate, x.nextStatus
FROM
( SELECT T1.id, T1.[Date] AS startDate,
LEAD(T1.[Date], 2) OVER (PARTITION BY T1.id ORDER BY T1.[Date]) AS endDate,
LEAD(T1.[Status], 3) OVER (PARTITION BY T1.id ORDER BY T1.[Date]) AS nextStatus,
SUM(CAST(T1.[Status] AS tinyint)) OVER (PARTITION BY T1.id ORDER BY T1.[Date] ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS SumNext3
FROM Table1 T1
) X
WHERE SumNext3 = 0
AND X.endDate IS NOT NULL; -- skip frames cut short at the end of a partition
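To try the windowed approach without a SQL Server instance, here is a minimal sketch that runs an equivalent query on SQLite through Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later). The names `ROWS`, `find_zero_runs`, and `rowsInFrame` are illustrative and not part of the original answer. Near the end of a partition the frame holds fewer than three rows, so the sketch also checks the frame's row count.

```python
import sqlite3

# Sample rows from the question: (id, date, status).
ROWS = [
    (1, "2012-10-18", 1), (1, "2012-10-19", 1), (1, "2012-10-20", 0),
    (1, "2012-10-21", 0), (1, "2012-10-22", 0), (1, "2012-10-23", 0),
    (1, "2012-10-24", 1), (1, "2012-10-25", 0), (1, "2012-10-26", 0),
    (1, "2012-10-27", 0), (1, "2012-10-28", 1),
    (2, "2012-10-19", 0), (2, "2012-10-20", 0), (2, "2012-10-21", 0),
    (2, "2012-10-22", 1), (2, "2012-10-23", 1),
]

# SQLite translation of the LEAD / windowed SUM query (an assumption, not the
# answerer's exact T-SQL). rowsInFrame guards against short trailing frames.
QUERY = """
SELECT id, startDate, endDate, nextStatus
FROM (
    SELECT id,
           date AS startDate,
           LEAD(date, 2)   OVER (PARTITION BY id ORDER BY date) AS endDate,
           LEAD(status, 3) OVER (PARTITION BY id ORDER BY date) AS nextStatus,
           SUM(status) OVER (PARTITION BY id ORDER BY date
                             ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS sumNext3,
           COUNT(*)    OVER (PARTITION BY id ORDER BY date
                             ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS rowsInFrame
    FROM Table1
) AS x
WHERE sumNext3 = 0 AND rowsInFrame = 3
ORDER BY id, startDate
"""


def find_zero_runs(rows):
    """Load rows into an in-memory table and return every 3-row run of zeros."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table1 (id INTEGER NOT NULL, date TEXT NOT NULL, "
        "status INTEGER, PRIMARY KEY (id, date))"
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in find_zero_runs(ROWS):
        print(row)
```

On the sample data this should return the four rows listed in the question. A run of zeros that ends the partition comes back with a NULL (`None`) next status rather than being dropped.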
Answer 1 (score: 3)
SELECT
z1.id, z1.[date] AS startDate ,z3.[date] AS endDate, zn.status AS NextDayStatus
FROM
Table1 z1
INNER JOIN Table1 z2 ON z2.[date] = (
SELECT MIN([date]) FROM Table1 WHERE [date] > z1.[date] AND id = z1.id
)
INNER JOIN Table1 z3 ON z3.Date = (
SELECT MIN([date]) FROM Table1 WHERE [date] > z2.[date] AND id = z1.id
)
INNER JOIN Table1 zn ON zn.Date = (
SELECT MIN([date]) FROM Table1 WHERE [date] > z3.[date] AND id = z1.id
)
WHERE
z1.status = 0
AND z2.status = 0 AND z2.id = z1.id
AND z3.status = 0 AND z3.id = z1.id
AND zn.id = z1.id
ORDER BY
z1.id, z1.[date]
An index on Table1 (date, status, id) would be optimal.
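Answer 2 (score: 2)
Here is another solution. It would work in many SQL products (those that support window functions), but this version is written specifically for SQL Server 2005 and later: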
WITH partitioned AS (
SELECT
*,
grp = DATEDIFF(DAY, 0, Date)
- ROW_NUMBER() OVER (PARTITION BY ID, Status ORDER BY Date)
FROM Table1
),
grouped AS (
SELECT
ID,
SD = MIN(Date),
ED = MAX(Date)
FROM partitioned
WHERE Status = 0
GROUP BY
ID,
grp
HAVING COUNT(*) >= 3
)
SELECT
t.ID,
StartDate = t.Date,
EndDate = DATEADD(DAY, 2, t.Date),
NextDayStatus = CASE t.Date WHEN DATEADD(DAY, -2, g.ED) THEN 1 ELSE 0 END
FROM Table1 t
INNER JOIN grouped g
ON t.ID = g.ID AND t.Date BETWEEN g.SD AND DATEADD(DAY, -2, g.ED)
;
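The idea is to detect all "islands" of Status = 0 rows, keep the islands that have at least 3 rows, and then join the aggregated island set back to the original table to find every row that starts a qualifying subset of 3 consecutive Status = 0 rows.
One caveat: this solution assumes that any 3 consecutive Status 0 rows are followed by at least one more row with the same ID. In other words, the last matching group of Status 0 rows should be followed by a Status 1 row, which is the row the result set reports.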
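The island idea and the caveat above can be sketched as follows. This is a variation on the answer, not the answerer's query: it runs on SQLite through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions), uses `julianday` and `date(..., '+N days')` as stand-ins for DATEDIFF and DATEADD, and looks up the next day's status with a LEFT JOIN instead of inferring it from the row's position in the island. The names `ROWS` and `find_runs_via_islands` are illustrative.

```python
import sqlite3

# Sample rows from the question: (id, date, status).
ROWS = [
    (1, "2012-10-18", 1), (1, "2012-10-19", 1), (1, "2012-10-20", 0),
    (1, "2012-10-21", 0), (1, "2012-10-22", 0), (1, "2012-10-23", 0),
    (1, "2012-10-24", 1), (1, "2012-10-25", 0), (1, "2012-10-26", 0),
    (1, "2012-10-27", 0), (1, "2012-10-28", 1),
    (2, "2012-10-19", 0), (2, "2012-10-20", 0), (2, "2012-10-21", 0),
    (2, "2012-10-22", 1), (2, "2012-10-23", 1),
]

QUERY = """
WITH partitioned AS (
    -- Day number minus row number is constant within a run of equal statuses.
    SELECT id, date, status,
           CAST(julianday(date) AS INTEGER)
             - ROW_NUMBER() OVER (PARTITION BY id, status ORDER BY date) AS grp
    FROM Table1
),
islands AS (
    -- Keep only the islands of zeros that span at least 3 rows.
    SELECT id, MIN(date) AS sd, MAX(date) AS ed
    FROM partitioned
    WHERE status = 0
    GROUP BY id, grp
    HAVING COUNT(*) >= 3
)
SELECT t.id,
       t.date                  AS startDate,
       date(t.date, '+2 days') AS endDate,
       nxt.status              AS nextDayStatus
FROM Table1 AS t
JOIN islands AS g
  ON t.id = g.id AND t.date BETWEEN g.sd AND date(g.ed, '-2 days')
-- LEFT JOIN: the following day's row may not exist, giving NULL.
LEFT JOIN Table1 AS nxt
  ON nxt.id = t.id AND nxt.date = date(t.date, '+3 days')
ORDER BY t.id, t.date
"""


def find_runs_via_islands(rows):
    """Return (id, startDate, endDate, nextDayStatus) for each 3-day zero run."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table1 (id INTEGER NOT NULL, date TEXT NOT NULL, "
        "status INTEGER, PRIMARY KEY (id, date))"
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in find_runs_via_islands(ROWS):
        print(row)
```

Like the original, this sketch relies on the question's assumption that dates have no gaps, because the end date is computed by adding two days to the start date. The difference is that an island ending the partition reports a NULL (`None`) next-day status instead of 1.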