SQL: Recursive CTE to group by date ranges

Asked: 2016-09-12 15:34:33

Tags: sql-server tsql

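I'm very new to SQL and am having trouble assigning points to patients based on their visit dates. I think a recursive CTE is the best way to do this, but I can't quite wrap my head around it.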

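Each OrganizationMrn should be assigned 1 point per visit, unless there was another visit within the previous 3 days. If a patient has multiple visits within 3 days, only one point should be assigned for them.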

Any help modifying my query into a recursive CTE that assigns points as described above would be greatly appreciated.

Example:
A patient has these 7 visits: 1/1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/11. This patient should be assigned 3 points: 1 point (1/1-1/4), 1 point (1/5-1/6), 1 point (1/11).

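Notes on the query: patient 25 should have 5 points in total;
the date range 2015-10-02 - 2015-10-05 should be assigned 1 point;
the date range 2015-11-08 - 2015-11-09 should be assigned 1 point;
each of the other dates has no other date within 3 days and should be assigned 1 point.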

WITH CTE AS
    (
    SELECT 
    ROW_NUMBER() OVER (PARTITION BY OrganizationMrn ORDER BY [Date]) AS ROWNUMBER, *
    FROM #RC
    )
    SELECT *, 
    ISNULL(DATEDIFF(DY,(SELECT OTHER.[Date] FROM CTE OTHER WHERE OTHER.OrganizationMrn = CTE.OrganizationMrn AND OTHER.ROWNUMBER = CTE.ROWNUMBER - 1), CTE.[Date]),0) AS DaysFromLastVisit, 
    CASE WHEN ISNULL(DATEDIFF(DY,(SELECT OTHER.[Date] FROM CTE OTHER WHERE OTHER.OrganizationMrn = CTE.OrganizationMrn AND OTHER.ROWNUMBER = CTE.ROWNUMBER - 1), CTE.[Date]),0) > 3 THEN 1 END AS POINTS
    FROM CTE
    ORDER BY OrganizationMrn, [Date];

    ROWNUMBER   OrganizationMrn Date        DaysFromLastVisit   POINTS
    1           25              2015-10-02  0                   NULL
    2           25              2015-10-03  1                   NULL
    3           25              2015-10-05  2                   NULL
    4           25              2015-11-08  34                  1
    5           25              2015-11-09  1                   NULL
    6           25              2016-03-04  116                 1
    7           25              2016-05-04  61                  1
    8           25              2016-05-10  6                   1

This is how #RC is populated:

SELECT I.OrganizationMrn, CAST(R.DTTM AS DATE) AS Date
INTO #RC
FROM Standard SD 
    INNER JOIN Iorg I ON I.Person = SD.Patient
    INNER JOIN Result R ON I.Person = R.Patient
WHERE 
    R.Entry = 'note'
AND R.DTTM >= DATEADD(M,-12,GETDATE()) 
AND OrganizationMrn = '25'
ORDER BY I.OrganizationMrn;

OrganizationMrn Date
25              2015-10-02
25              2015-10-03
25              2015-10-05
25              2015-11-08
25              2015-11-09
25              2016-03-04
25              2016-05-04
25              2016-05-10

How can I modify this CASE statement so that it assigns a point to only one of the 3 dates? It currently assigns a point to each day: 10/2, 10/3, 10/5.

CASE 
WHEN ISNULL(DATEDIFF(DY,(SELECT OTHER.[Date] FROM CTE OTHER WHERE OTHER.OrganizationMrn = CTE.OrganizationMrn AND OTHER.ROWNUMBER = CTE.ROWNUMBER - 1), CTE.[Date]),0) <= 3 
THEN 1 ELSE 0 END AS POINTS

3 Answers:

Answer 0: (score: 1)

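Do you need to output each group, or is it enough to know the point value? If the latter, you can treat this as a variation of the "gaps and islands" problem. If you want to dig deeper, there is a good article here. I'm adapting a code snippet from that page.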

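Define a starting point as a record that has no other record within the 3 days before it. An ending point is a record that has no other record within the 3 days after it. Once each island is identified, we can take the number of days between its start and end and work out how many 3-day groups fit into it by dividing and rounding up. Note: the code below is hard-coded for organization 1.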

CREATE TABLE #t(
    OrganizationMrn int,
    VisitDate date)

INSERT #t(OrganizationMrn, VisitDate) VALUES 
    (1, '1/1/2016'),
    (1, '1/2/2016'),
    (1, '1/3/2016'),
    (1, '1/4/2016'),
    (1, '1/5/2016'),
    (1, '1/6/2016'),
    (1, '1/11/2016')

;WITH StartingPoints AS (
    SELECT VisitDate, ROW_NUMBER() OVER (ORDER BY VisitDate) AS Sequence FROM #t AS A 
    WHERE A.OrganizationMrn = 1 AND NOT EXISTS (
        SELECT * FROM #t AS B 
        WHERE B.OrganizationMrn = A.OrganizationMrn AND 
            B.VisitDate >= DATEADD(day, -4, A.VisitDate)
            AND 
            B.VisitDate < A.VisitDate
        )
),
EndingPoints AS (
    SELECT VisitDate, ROW_NUMBER() OVER (ORDER BY VisitDate) AS Sequence FROM #t AS A 
    WHERE A.OrganizationMrn = 1 AND NOT EXISTS (
        SELECT * FROM #t AS B 
        WHERE B.OrganizationMrn = A.OrganizationMrn AND 
            B.VisitDate <= DATEADD(day, 4, A.VisitDate)
            AND 
            B.VisitDate > A.VisitDate
        )
)

SELECT 
     S.VisitDate AS StartDate
    ,E.VisitDate AS EndDate 
    ,CEILING((DATEDIFF(day, S.VisitDate, E.VisitDate) + 1) / 4.0) AS Points
FROM 
    StartingPoints AS S 
    JOIN EndingPoints AS E ON (E.Sequence = S.Sequence)
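
The query above is hard-coded for organization 1. As a minimal sketch of one way to lift that restriction, the version below partitions the sequence numbers by OrganizationMrn and sums the points per organization. The table name #t2 and its sample rows are assumptions for illustration (organization 1 repeats the answer's data, organization 25 repeats the dates from the question); the window constants are kept exactly as in the answer.

```sql
-- Hypothetical table for illustration; not part of the original answer
CREATE TABLE #t2 (OrganizationMrn int, VisitDate date);

INSERT #t2 (OrganizationMrn, VisitDate) VALUES
    (1, '20160101'), (1, '20160102'), (1, '20160103'), (1, '20160104'),
    (1, '20160105'), (1, '20160106'), (1, '20160111'),
    (25, '20151002'), (25, '20151003'), (25, '20151005'), (25, '20151108'),
    (25, '20151109'), (25, '20160304'), (25, '20160504'), (25, '20160510');

WITH StartingPoints AS (
    -- Visits with no earlier visit close enough to belong to the same island
    SELECT A.OrganizationMrn, A.VisitDate,
           ROW_NUMBER() OVER (PARTITION BY A.OrganizationMrn ORDER BY A.VisitDate) AS Sequence
    FROM #t2 AS A
    WHERE NOT EXISTS (
        SELECT * FROM #t2 AS B
        WHERE B.OrganizationMrn = A.OrganizationMrn
          AND B.VisitDate >= DATEADD(day, -4, A.VisitDate)
          AND B.VisitDate < A.VisitDate)
),
EndingPoints AS (
    -- Visits with no later visit close enough to belong to the same island
    SELECT A.OrganizationMrn, A.VisitDate,
           ROW_NUMBER() OVER (PARTITION BY A.OrganizationMrn ORDER BY A.VisitDate) AS Sequence
    FROM #t2 AS A
    WHERE NOT EXISTS (
        SELECT * FROM #t2 AS B
        WHERE B.OrganizationMrn = A.OrganizationMrn
          AND B.VisitDate <= DATEADD(day, 4, A.VisitDate)
          AND B.VisitDate > A.VisitDate)
)
SELECT S.OrganizationMrn,
       -- Same per-island formula as the answer, summed per organization
       CAST(SUM(CEILING((DATEDIFF(day, S.VisitDate, E.VisitDate) + 1) / 4.0)) AS int) AS Points
FROM StartingPoints AS S
JOIN EndingPoints AS E
  ON E.OrganizationMrn = S.OrganizationMrn
 AND E.Sequence = S.Sequence
GROUP BY S.OrganizationMrn;
```

With this sample data the query should return 3 points for organization 1 and 5 for organization 25. Because the ceiling formula looks only at the length of each island, it is an estimate: for sparse visits inside one island (for example 1/1, 1/4, 1/7, 1/10) it can give a different count than walking the visits one by one.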

Answer 1: (score: 1)

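The straightforward approach uses the lag() window function to bring the previous date onto the current row, then uses a conditional sum() to decide whether or not to assign a point.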

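Assign a point when the previous date does not fall between the current date minus 3 days and the current date, or when there is no previous date: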

select 
  organizationmrn, 
  sum(case when 
        prev_date not between dateadd(day, -3, date) and date 
        or prev_date is null 
      then 1 else 0 end) as points
from (
  select 
    *, 
    lag(date,1) over (partition by organizationmrn order by date) as prev_date
  from rc 
  ) calculate_prev_date
group by organizationmrn

Result

 organizationmrn | points
-----------------+--------
              25 |      5
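
Comparing each visit only with the one immediately before it chains consecutive visits together, so for the question's seven-visit example (1/1 through 1/6, then 1/11) that logic yields 2 points rather than the expected 3. The recursive CTE the question asks about could be sketched roughly as follows: it walks each patient's visits in order and opens a new window whenever a visit falls more than 3 days after the start of the current window. The table #Visits, the CTE names Numbered and Walk, and the column WindowStart are assumptions made for this illustration.

```sql
-- Hypothetical sample data mirroring the question's 7-visit example
CREATE TABLE #Visits (OrganizationMrn int, VisitDate date);

INSERT #Visits (OrganizationMrn, VisitDate) VALUES
    (1, '20160101'), (1, '20160102'), (1, '20160103'), (1, '20160104'),
    (1, '20160105'), (1, '20160106'), (1, '20160111');

WITH Numbered AS (
    -- One row per patient and distinct visit date, numbered in date order
    SELECT OrganizationMrn, VisitDate,
           ROW_NUMBER() OVER (PARTITION BY OrganizationMrn ORDER BY VisitDate) AS rn
    FROM (SELECT DISTINCT OrganizationMrn, VisitDate FROM #Visits) AS d
),
Walk AS (
    -- Anchor: the first visit of each patient opens a window and earns a point
    SELECT OrganizationMrn, VisitDate, rn, VisitDate AS WindowStart, 1 AS Points
    FROM Numbered
    WHERE rn = 1

    UNION ALL

    -- Recursive step: move to the next visit; if it is more than 3 days
    -- after the current window start, it opens a new window and earns a point
    SELECT n.OrganizationMrn, n.VisitDate, n.rn,
           CASE WHEN DATEDIFF(day, w.WindowStart, n.VisitDate) > 3
                THEN n.VisitDate ELSE w.WindowStart END,
           CASE WHEN DATEDIFF(day, w.WindowStart, n.VisitDate) > 3
                THEN 1 ELSE 0 END
    FROM Walk AS w
    JOIN Numbered AS n
      ON n.OrganizationMrn = w.OrganizationMrn
     AND n.rn = w.rn + 1
)
SELECT OrganizationMrn, SUM(Points) AS Points
FROM Walk
GROUP BY OrganizationMrn
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

For the seven sample visits this should return 3 points for organization 1, and the same walk applied to patient 25's dates gives 5. The recursion advances one visit at a time, so it may be slow on large tables.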

Answer 2: (score: 0)

I'm also new to SQL. I found this article at https://bertwagner.com/posts/gaps-and-islands, which explains well how to solve this "gaps and islands" problem.

Here is his sample SQL, which worked for me:

DROP TABLE IF EXISTS #OverlappingDateRanges;
CREATE TABLE #OverlappingDateRanges (StartDate date, EndDate date);

INSERT INTO #OverlappingDateRanges
SELECT '8/24/2017', '9/23/2017'  UNION ALL
SELECT '8/24/2017', '9/20/2017'  UNION ALL 
SELECT '9/23/2017', '9/27/2017'  UNION ALL 
SELECT '9/25/2017', '10/10/2017' UNION ALL
SELECT '10/17/2017','10/18/2017' UNION ALL 
SELECT '10/25/2017','11/3/2017'  UNION ALL 
SELECT '11/3/2017', '11/15/2017'


SELECT
    MIN(StartDate) AS IslandStartDate,
    MAX(EndDate) AS IslandEndDate
FROM (
    SELECT
        *,
        CASE WHEN Groups.PreviousEndDate >= StartDate THEN 0 ELSE 1 END AS IslandStartInd,
        SUM(CASE WHEN Groups.PreviousEndDate >= StartDate THEN 0 ELSE 1 END) OVER (ORDER BY Groups.RN) AS IslandId
    FROM (
        SELECT
            ROW_NUMBER() OVER(ORDER BY StartDate,EndDate) AS RN,
            StartDate,
            EndDate,
            LAG(EndDate,1) OVER (ORDER BY StartDate, EndDate) AS PreviousEndDate
        FROM
            #OverlappingDateRanges
    ) Groups
) Islands
GROUP BY
    IslandId
ORDER BY 
    IslandStartDate
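
The sample above works on start/end date ranges, whereas the question has one date per visit. As a rough sketch of how the same LAG plus running-SUM pattern might be adapted, the query below flags a visit as the start of an island when the previous visit for the same patient is more than 3 days earlier (or does not exist), then groups by the running total of those flags. The table #VisitIslands is a hypothetical stand-in for the question's #RC, filled with patient 25's dates.

```sql
-- Hypothetical stand-in for the question's #RC table
CREATE TABLE #VisitIslands (OrganizationMrn int, VisitDate date);

INSERT #VisitIslands (OrganizationMrn, VisitDate) VALUES
    (25, '20151002'), (25, '20151003'), (25, '20151005'), (25, '20151108'),
    (25, '20151109'), (25, '20160304'), (25, '20160504'), (25, '20160510');

SELECT OrganizationMrn,
       MIN(VisitDate) AS IslandStartDate,
       MAX(VisitDate) AS IslandEndDate
FROM (
    SELECT OrganizationMrn, VisitDate,
           -- Running count of island starts gives every visit its island id
           SUM(IslandStartInd) OVER (PARTITION BY OrganizationMrn
                                     ORDER BY VisitDate
                                     ROWS UNBOUNDED PRECEDING) AS IslandId
    FROM (
        SELECT OrganizationMrn, VisitDate,
               -- 0 when the previous visit is at most 3 days earlier, else 1
               -- (a NULL previous date also falls through to 1)
               CASE WHEN DATEDIFF(day,
                                  LAG(VisitDate) OVER (PARTITION BY OrganizationMrn
                                                       ORDER BY VisitDate),
                                  VisitDate) <= 3
                    THEN 0 ELSE 1 END AS IslandStartInd
        FROM #VisitIslands
    ) AS Flagged
) AS Islands
GROUP BY OrganizationMrn, IslandId
ORDER BY OrganizationMrn, IslandStartDate;
```

For patient 25 this should list five islands, one per expected point. Each visit is compared only with the visit just before it, so a long run of closely spaced visits (such as 1/1 through 1/6 in the question's example) stays a single island rather than being split into 3-day groups.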