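I have a table that contains one record for each ward stay within a hospital spell (note: a spell can include transfers to other hospitals). Spellno is the unique identifier of a spell. I want to aggregate the consecutive ward stays within a spell up to hospital level. My problem is that if a patient goes from hospital 1 to hospital 2 and then back to hospital 1, a GROUP BY on 'Spellno' and 'Hospital' combines the two separate stays in hospital 1, which I don't want.
E.g. if this is my data: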
    Spellno  Hospital  WardCode  WardStart   WardEnd
    ---------------------------------------------------
    123      hosp1     ward1     01/04/2015  03/04/2015
    123      hosp1     ward4     03/04/2015  05/04/2015
    123      hosp2     ward2     05/04/2015  07/04/2015
    123      hosp1     ward3     07/04/2015  10/04/2015
    123      hosp1     ward1     10/04/2015  12/04/2015
I would like to aggregate on Spellno and Hospital to get:
    Spellno  Hospital  WardStart   WardEnd
    -----------------------------------------
    123      hosp1     01/04/2015  05/04/2015
    123      hosp2     05/04/2015  07/04/2015
    123      hosp1     07/04/2015  12/04/2015
Many thanks in advance.
Answer 0 (score: 4)
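You can use a subquery in the WHERE clause to filter out overlapping date ranges, so that only the row starting each range remains, and a second subquery in the SELECT list to get the end of each range.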
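    SELECT D.Spellno, D.Hospital, D.WardStart,
           -- End of the range: the earliest WardEnd at or after this row's WardEnd
           -- such that no row of the same spell and hospital starts within
           -- (E.WardStart, E.WardEnd].
           (SELECT MIN(E.WardEnd)
            FROM #tab E
            WHERE E.WardEnd >= D.WardEnd
              AND E.Spellno = D.Spellno
              AND E.Hospital = D.Hospital
              AND NOT EXISTS (SELECT 1
                              FROM #tab E2
                              WHERE E.WardStart < E2.WardStart
                                AND E.WardEnd >= E2.WardStart
                                AND D.Spellno = E2.Spellno
                                AND D.Hospital = E2.Hospital)
           ) AS WardEnd
    FROM #tab D
    -- Keep only the rows that start a range: no row of the same spell and
    -- hospital has a WardEnd within [D.WardStart, D.WardEnd).
    WHERE NOT EXISTS (SELECT 1
                      FROM #tab D2
                      WHERE D.WardStart <= D2.WardEnd
                        AND D.WardEnd > D2.WardEnd
                        AND D.Spellno = D2.Spellno
                        AND D.Hospital = D2.Hospital)
    ORDER BY D.Spellno, D.WardStart;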
Warning: this query's performance may not be optimal, but it gets the job done.
LiveDemo
Output:
╔═════════╦══════════╦═════════════════════╦═════════════════════╗
║ Spellno ║ Hospital ║      WardStart      ║       WardEnd       ║
╠═════════╬══════════╬═════════════════════╬═════════════════════╣
║     123 ║ hosp1    ║ 2015-04-01 00:00:00 ║ 2015-04-05 00:00:00 ║
║     123 ║ hosp2    ║ 2015-04-05 00:00:00 ║ 2015-04-07 00:00:00 ║
║     123 ║ hosp1    ║ 2015-04-07 00:00:00 ║ 2015-04-12 00:00:00 ║
╚═════════╩══════════╩═════════════════════╩═════════════════════╝
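The query reads from a temp table named #tab that is never defined in the post. A minimal setup sketch for trying it out could look like the following. The column types and lengths are assumptions (DATETIME is guessed from the timestamps in the output), and the date literals use the unseparated YYYYMMDD form so they do not depend on the session's date format.

```sql
-- Assumed schema: types and lengths are guesses, not taken from the question.
CREATE TABLE #tab (
    Spellno   INT         NOT NULL,
    Hospital  VARCHAR(20) NOT NULL,
    WardCode  VARCHAR(20) NOT NULL,
    WardStart DATETIME    NOT NULL,
    WardEnd   DATETIME    NOT NULL
);

-- Sample rows from the question (01/04/2015 is 1 April 2015).
INSERT INTO #tab (Spellno, Hospital, WardCode, WardStart, WardEnd)
VALUES (123, 'hosp1', 'ward1', '20150401', '20150403'),
       (123, 'hosp1', 'ward4', '20150403', '20150405'),
       (123, 'hosp2', 'ward2', '20150405', '20150407'),
       (123, 'hosp1', 'ward3', '20150407', '20150410'),
       (123, 'hosp1', 'ward1', '20150410', '20150412');
```

Run this in the same session as the query, since a # temp table is only visible to the connection that created it.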
Answer 1 (score: 1)
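I'm assuming that the (WardStart, WardEnd) date ranges are strictly consecutive, with no overlaps. For simplicity, I'm also assuming that no run of consecutive ranges is longer than the max recursion default.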
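On the second assumption: SQL Server stops a recursive query with an error once it exceeds 100 recursion levels by default. If a single hospital stay could chain more ward moves than that, the limit can be raised per statement with the MAXRECURSION query hint. The sketch below is a stand-alone illustration of the hint, unrelated to the ward data; the CTE name numbers is made up for the example.

```sql
-- Counts from 1 to 200, which takes 199 recursive steps:
-- more than the default limit of 100, so the hint is required.
WITH numbers(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1
    FROM numbers
    WHERE n < 200
)
SELECT MAX(n) AS LastValue
FROM numbers
OPTION (MAXRECURSION 200);  -- 0 would remove the limit entirely
```

The hint goes at the very end of the statement that consumes the CTE, so for a ward query it would follow the final ORDER BY.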
This can be solved using recursive SQL:
    WITH
      -- Sample data from the question
      data AS (
        SELECT *
        FROM (
          VALUES (123, 'hosp1', 'ward1', CAST('2015-04-01' AS DATE), CAST('2015-04-03' AS DATE)),
                 (123, 'hosp1', 'ward4', CAST('2015-04-03' AS DATE), CAST('2015-04-05' AS DATE)),
                 (123, 'hosp2', 'ward2', CAST('2015-04-05' AS DATE), CAST('2015-04-07' AS DATE)),
                 (123, 'hosp1', 'ward3', CAST('2015-04-07' AS DATE), CAST('2015-04-10' AS DATE)),
                 (123, 'hosp1', 'ward1', CAST('2015-04-10' AS DATE), CAST('2015-04-12' AS DATE))
        ) AS t(Spellno, Hospital, WardCode, WardStart, WardEnd)
      ),
      consecutive(Spellno, Hospital, WardStart, WardEnd) AS (
        -- Anchor: rows with no preceding row in the same spell and hospital
        SELECT Spellno, Hospital, WardStart, WardEnd
        FROM data AS d1
        WHERE NOT EXISTS (
          SELECT *
          FROM data AS d2
          WHERE d1.Spellno = d2.Spellno
            AND d1.Hospital = d2.Hospital
            AND d1.WardStart = d2.WardEnd
        )
        UNION ALL
        -- Recursion: extend each series with the row that starts where it ends
        SELECT c.Spellno, c.Hospital, c.WardStart, d.WardEnd
        FROM consecutive AS c
        JOIN data AS d
          ON c.Spellno = d.Spellno
         AND c.Hospital = d.Hospital
         AND c.WardEnd = d.WardStart
      )
    SELECT Spellno, Hospital, WardStart, MAX(WardEnd) AS WardEnd
    FROM consecutive
    GROUP BY Spellno, Hospital, WardStart
    ORDER BY Spellno, WardStart
The first subquery in the recursive CTE consecutive initialises the recursion with all rows that have no "previous row" for the same (Spellno, Hospital). This yields:
    Spellno  Hospital  WardStart   WardEnd
    -----------------------------------------
    123      hosp1     2015-04-01  2015-04-03
    123      hosp2     2015-04-05  2015-04-07
    123      hosp1     2015-04-07  2015-04-10
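The recursion then produces a new row made of the previous row's WardStart (which stays the same throughout a consecutive series) and the current row's WardEnd. This yields: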
    Spellno  Hospital  WardStart   WardEnd
    -----------------------------------------
    123      hosp1     2015-04-01  2015-04-03  <-- Unwanted, "intermediary" row
    123      hosp1     2015-04-01  2015-04-05
    123      hosp2     2015-04-05  2015-04-07
    123      hosp1     2015-04-07  2015-04-10  <-- Unwanted, "intermediary" row
    123      hosp1     2015-04-07  2015-04-12
Finally, in the outer query, we select only the maximum WardEnd for each consecutive series, producing the desired result:
    Spellno  Hospital  WardStart   WardEnd
    -----------------------------------------
    123      hosp1     2015-04-01  2015-04-05
    123      hosp2     2015-04-05  2015-04-07
    123      hosp1     2015-04-07  2015-04-12
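For comparison, the same result can probably be reached without recursion by using window functions. This is not part of either answer above; it is a sketch of a different technique (often called "gaps and islands") and assumes SQL Server 2012 or later for LAG and the framed SUM. It also assumes that WardStart is unique within a spell and that a change of hospital is the only thing that ends a stay, so it does not check for date gaps between ward rows. The names flagged, grouped, IsNewStay and StayNo are made up for the illustration.

```sql
WITH
  -- Sample data from the question
  data AS (
    SELECT *
    FROM (
      VALUES (123, 'hosp1', 'ward1', CAST('2015-04-01' AS DATE), CAST('2015-04-03' AS DATE)),
             (123, 'hosp1', 'ward4', CAST('2015-04-03' AS DATE), CAST('2015-04-05' AS DATE)),
             (123, 'hosp2', 'ward2', CAST('2015-04-05' AS DATE), CAST('2015-04-07' AS DATE)),
             (123, 'hosp1', 'ward3', CAST('2015-04-07' AS DATE), CAST('2015-04-10' AS DATE)),
             (123, 'hosp1', 'ward1', CAST('2015-04-10' AS DATE), CAST('2015-04-12' AS DATE))
    ) AS t(Spellno, Hospital, WardCode, WardStart, WardEnd)
  ),
  -- Flag each row whose hospital differs from the previous row of the spell.
  -- The first row of a spell has no previous row (LAG returns NULL), so it is flagged too.
  flagged AS (
    SELECT Spellno, Hospital, WardStart, WardEnd,
           CASE WHEN LAG(Hospital) OVER (PARTITION BY Spellno ORDER BY WardStart) = Hospital
                THEN 0 ELSE 1 END AS IsNewStay
    FROM data
  ),
  -- A running total of the flags numbers the hospital stays within each spell.
  grouped AS (
    SELECT Spellno, Hospital, WardStart, WardEnd,
           SUM(IsNewStay) OVER (PARTITION BY Spellno ORDER BY WardStart
                                ROWS UNBOUNDED PRECEDING) AS StayNo
    FROM flagged
  )
SELECT Spellno, Hospital, MIN(WardStart) AS WardStart, MAX(WardEnd) AS WardEnd
FROM grouped
GROUP BY Spellno, Hospital, StayNo
ORDER BY Spellno, MIN(WardStart);
```

On the sample data the flags come out as 1, 0, 1, 1, 0, so the rows fall into stays 1, 1, 2, 3, 3 and the grouping returns the same three rows as above. Because each row is read a fixed number of times, this form may scale better than the correlated subqueries or the recursion, though that would need to be measured on real data.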