SQL Server: aggregate only consecutive rows when using GROUP BY

Time: 2015-10-27 08:38:53

Tags: sql-server group-by

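I have a table containing one record for each ward stay within a hospital spell (note: a spell can include transfers to other hospitals). Spellno is the unique identifier of a spell. I want to aggregate the consecutive ward stays within a spell up to hospital level. My problem is that if a patient goes from hospital 1 to hospital 2 and then back to hospital 1, a GROUP BY on Spellno and Hospital combines the two separate stays at hospital 1, which is not what I want.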

For example, if this is my data:

Spellno   Hospital   WardCode   WardStart   WardEnd 
-------------------------------------------------------------------
123       hosp1      ward1      01/04/2015  03/04/2015
123       hosp1      ward4      03/04/2015  05/04/2015
123       hosp2      ward2      05/04/2015  07/04/2015
123       hosp1      ward3      07/04/2015  10/04/2015
123       hosp1      ward1      10/04/2015  12/04/2015

I would like to aggregate on Spellno and Hospital to get:

Spellno   Hospital   WardStart   WardEnd 
-------------------------------------------------------------------
123       hosp1      01/04/2015  05/04/2015
123       hosp2      05/04/2015  07/04/2015
123       hosp1      07/04/2015  12/04/2015
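
For comparison, a plain GROUP BY on both columns merges the two hosp1 stays into one row. The following is a minimal sketch, assuming the data sits in a temporary table named #tab (the name is borrowed from the first answer; the question does not name the table) and that the date columns are of type DATE:

```sql
-- Hypothetical setup: table name and column types are assumptions.
CREATE TABLE #tab (
    Spellno   INT,
    Hospital  VARCHAR(10),
    WardCode  VARCHAR(10),
    WardStart DATE,
    WardEnd   DATE
);

-- Sample rows from the question (dates written as language-neutral YYYYMMDD).
INSERT INTO #tab (Spellno, Hospital, WardCode, WardStart, WardEnd) VALUES
    (123, 'hosp1', 'ward1', '20150401', '20150403'),
    (123, 'hosp1', 'ward4', '20150403', '20150405'),
    (123, 'hosp2', 'ward2', '20150405', '20150407'),
    (123, 'hosp1', 'ward3', '20150407', '20150410'),
    (123, 'hosp1', 'ward1', '20150410', '20150412');

-- Naive aggregation: one row per (Spellno, Hospital), regardless of continuity.
SELECT Spellno, Hospital, MIN(WardStart) AS WardStart, MAX(WardEnd) AS WardEnd
FROM #tab
GROUP BY Spellno, Hospital
ORDER BY Hospital;
-- Returns two rows instead of three:
--   123  hosp1  2015-04-01  2015-04-12
--   123  hosp2  2015-04-05  2015-04-07
```

The hosp1 row spans the whole spell, including the days actually spent at hosp2.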

Many thanks in advance.

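2 answers:

Answer 0 (score: 4)

You can use a subquery in the WHERE clause to keep only the rows that start a consecutive series (filtering out rows that continue an earlier date range), and a second subquery in the SELECT list to find where that series ends.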

SELECT Spellno, Hospital,D.WardStart,
   (SELECT Min(E.WardEnd)
    FROM #tab E
    WHERE E.WardEnd >= D.WardEnd
      AND E.Spellno = D.Spellno
      AND E.Hospital = D.Hospital
      AND NOT EXISTS (SELECT 1
                      FROM #tab E2
                      WHERE E.WardStart < E2.WardStart
                        AND E.WardEnd >= E2.WardStart
                        AND D.Spellno = E2.Spellno
                        AND D.Hospital = E2.Hospital)
  ) AS WardEnd
FROM #tab D
WHERE NOT EXISTS (SELECT 1
                  FROM #tab D2
                  WHERE D.WardStart <= D2.WardEnd
                    AND D.WardEnd > D2.WardEnd
                    AND D.Spellno = D2.Spellno
                    AND D.Hospital = D2.Hospital)

Warning:

The performance of this query may not be optimal, but it does the job.

LiveDemo

Output:

╔═════════╦══════════╦═════════════════════╦═════════════════════╗
║ Spellno ║ Hospital ║      WardStart      ║       WardEnd       ║
╠═════════╬══════════╬═════════════════════╬═════════════════════╣
║     123 ║ hosp1    ║ 2015-04-01 00:00:00 ║ 2015-04-05 00:00:00 ║
║     123 ║ hosp2    ║ 2015-04-05 00:00:00 ║ 2015-04-07 00:00:00 ║
║     123 ║ hosp1    ║ 2015-04-07 00:00:00 ║ 2015-04-12 00:00:00 ║
╚═════════╩══════════╩═════════════════════╩═════════════════════╝
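
If the correlated subqueries prove too slow, the same result can be sketched with window functions. This is a different technique from the query above (the usual "gaps and islands" pattern), not part of the original answer. It assumes SQL Server 2012 or later, and that ordering the rows of a spell by WardStart reflects the real sequence of ward stays. The CTE names flagged and numbered and the columns IsNewStay and StayNo are made up for illustration:

```sql
-- Sketch only: requires SQL Server 2012+ for LAG and windowed SUM with ORDER BY.
WITH data AS (
    -- Sample rows from the question; replace this CTE with the real table.
    SELECT *
    FROM (
      VALUES (123, 'hosp1', 'ward1', CAST('20150401' AS DATE), CAST('20150403' AS DATE)),
             (123, 'hosp1', 'ward4', CAST('20150403' AS DATE), CAST('20150405' AS DATE)),
             (123, 'hosp2', 'ward2', CAST('20150405' AS DATE), CAST('20150407' AS DATE)),
             (123, 'hosp1', 'ward3', CAST('20150407' AS DATE), CAST('20150410' AS DATE)),
             (123, 'hosp1', 'ward1', CAST('20150410' AS DATE), CAST('20150412' AS DATE))
    ) AS t(Spellno, Hospital, WardCode, WardStart, WardEnd)
),
flagged AS (
    SELECT Spellno, Hospital, WardStart, WardEnd,
           -- 0 when the previous ward stay in the spell was in the same hospital,
           -- 1 otherwise (including the first row, where LAG returns NULL).
           CASE WHEN LAG(Hospital) OVER (PARTITION BY Spellno ORDER BY WardStart) = Hospital
                THEN 0 ELSE 1 END AS IsNewStay
    FROM data
),
numbered AS (
    SELECT Spellno, Hospital, WardStart, WardEnd,
           -- Running total of the flags gives each hospital stay its own number.
           SUM(IsNewStay) OVER (PARTITION BY Spellno ORDER BY WardStart
                                ROWS UNBOUNDED PRECEDING) AS StayNo
    FROM flagged
)
SELECT Spellno, Hospital, MIN(WardStart) AS WardStart, MAX(WardEnd) AS WardEnd
FROM numbered
GROUP BY Spellno, Hospital, StayNo
ORDER BY Spellno, MIN(WardStart);
```

For the sample data the flags come out as 1, 0, 1, 1, 0, so StayNo is 1, 1, 2, 3, 3 and the final GROUP BY returns the three rows shown above. Note that this version groups by a change of hospital rather than by touching dates, so a gap between two ward stays in the same hospital would not split the stay; whether that is desirable depends on the data.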

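Answer 1 (score: 1)

I am assuming that the (WardStart, WardEnd) date ranges are strictly consecutive, with no overlaps. For simplicity, I also assume that a consecutive series is never longer than the default maximum recursion depth (100 levels in SQL Server).

This can be solved using recursive SQL: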

WITH 
  data AS (
    SELECT * 
    FROM (
      VALUES (123, 'hosp1', 'ward1', CAST('2015-04-01' AS DATE), CAST('2015-04-03' AS DATE)),
             (123, 'hosp1', 'ward4', CAST('2015-04-03' AS DATE), CAST('2015-04-05' AS DATE)),
             (123, 'hosp2', 'ward2', CAST('2015-04-05' AS DATE), CAST('2015-04-07' AS DATE)),
             (123, 'hosp1', 'ward3', CAST('2015-04-07' AS DATE), CAST('2015-04-10' AS DATE)),
             (123, 'hosp1', 'ward1', CAST('2015-04-10' AS DATE), CAST('2015-04-12' AS DATE))
    ) AS t(Spellno, Hospital, WardCode, WardStart, WardEnd)
  ),
  consecutive(Spellno, Hospital, WardStart, WardEnd) AS (
    SELECT Spellno, Hospital, WardStart, WardEnd
    FROM data AS d1
    WHERE NOT EXISTS (
      SELECT *
      FROM data AS d2
      WHERE d1.Spellno = d2.Spellno
      AND d1.Hospital = d2.Hospital
      AND d1.WardStart = d2.WardEnd
    )
    UNION ALL
    SELECT c.Spellno, c.Hospital, c.WardStart, d.WardEnd
    FROM consecutive AS c
    JOIN data AS d
    ON c.Spellno = d.Spellno
    AND c.Hospital = d.Hospital
    AND c.WardEnd = d.WardStart
  )
SELECT Spellno, Hospital, WardStart, MAX(WardEnd) AS WardEnd
FROM consecutive
GROUP BY Spellno, Hospital, WardStart
ORDER BY Spellno, WardStart

Demo

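Explanation

The first subquery in the recursive CTE consecutive initialises the recursion with all rows that have no "previous row" for the same (Spellno, Hospital), that is, no row whose WardEnd equals their WardStart. This produces: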

Spellno  Hospital  WardStart   WardEnd
-----------------------------------------
123      hosp1     2015-04-01  2015-04-03
123      hosp2     2015-04-05  2015-04-07
123      hosp1     2015-04-07  2015-04-10

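The recursion then produces a new row made of the previous row's WardStart (which is always the same within a consecutive series) and the current row's WardEnd. Combined with the initial rows, this produces: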

Spellno  Hospital  WardStart   WardEnd
-----------------------------------------
123      hosp1     2015-04-01  2015-04-03 <-- Unwanted, "intermediary" row
123      hosp1     2015-04-01  2015-04-05
123      hosp2     2015-04-05  2015-04-07
123      hosp1     2015-04-07  2015-04-10 <-- Unwanted, "intermediary" row
123      hosp1     2015-04-07  2015-04-12

Finally, in the outer query, we select only the maximum WardEnd for each consecutive series, which produces the desired result:

Spellno  Hospital  WardStart   WardEnd
-----------------------------------------
123      hosp1     2015-04-01  2015-04-05
123      hosp2     2015-04-05  2015-04-07
123      hosp1     2015-04-07  2015-04-12
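
The recursion-depth assumption stated at the top of this answer refers to SQL Server's default limit of 100 recursion levels per statement. If a series of consecutive ward stays could be longer than that, the limit can be raised with the MAXRECURSION query hint on the outer statement. A minimal self-contained sketch, using a hypothetical counter CTE n rather than the ward data:

```sql
-- Hypothetical counter CTE, used only to demonstrate the hint.
WITH n(i) AS (
    SELECT 1
    UNION ALL
    SELECT i + 1 FROM n WHERE i < 500   -- recurses 499 times, beyond the default of 100
)
SELECT MAX(i) AS Depth
FROM n
OPTION (MAXRECURSION 500);              -- without this hint the statement fails at level 100
```

Applied to the query above, the hint would go after the final ORDER BY clause. The hint accepts values from 0 to 32767; MAXRECURSION 0 removes the limit entirely, at the risk of an endless loop if the data lets a row join to itself (for example a ward stay whose WardStart equals its WardEnd).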