Counting sequential events and sequence counts in SQL

Asked: 2014-06-12 18:57:36

Tags: sql sql-server-2008-r2 parent-child common-table-expression row-number

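I built a query using an answer found here, which was very helpful, and added a few things to fit my needs. One of them is a ROW_NUMBER() that counts how many times someone has been readmitted within 30 days, over any length of time. As suggested in the first answer and in a question that was posted here, I insert the cte results into a temporary table (the @PPR table variable below). That did not solve the chain length and chain count problem, though.

Here is the query: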

-- CREATE TABLE TO STORE CTE RESULTS
DECLARE @PPR TABLE(
    VISIT1      VARCHAR(20) -- THIS IS A UNIQUE VALUE
    , READMIT   VARCHAR(20) -- THIS IS A UNIQUE VALUE
    , MRN       VARCHAR(10) -- THIS IS UNIQUE TO A PERSON
    , INIT_DISC DATETIME
    , RA_ADM    DATETIME
    , R1        INT
    , R2        INT
    , INTERIM1  VARCHAR(20)
    , RA_COUNT  INT
    , FLAG      VARCHAR(2)
);

-- THE CTE THAT WILL GET USED TO POPULATE THE ABOVE TABLE
WITH cte AS (
  SELECT PTNO_NUM
    , Med_Rec_No
    , Dsch_Date
    , Adm_Date
    , ROW_NUMBER() OVER (
                         PARTITION BY MED_REC_NO 
                         ORDER BY PtNo_Num
                         ) AS r

  FROM smsdss.BMH_PLM_PtAcct_V

  WHERE Plm_Pt_Acct_Type = 'I'
  AND PtNo_Num < '20000000' 
  )

-- INSERT CTE RESULTS INTO PPR TABLE
INSERT INTO @PPR
SELECT
c1.PtNo_Num                                AS [INDEX]
, c2.PtNo_Num                              AS [READMIT]
, c1.Med_Rec_No                            AS [MRN]
, c1.Dsch_Date                             AS [INITIAL DISCHARGE]
, c2.Adm_Date                              AS [READMIT DATE]
, C1.r
, C2.r
, DATEDIFF(DAY, c1.Dsch_Date, c2.Adm_Date) AS INTERIM1
, ROW_NUMBER() OVER (
                    PARTITION BY C1.MED_REC_NO 
                    ORDER BY C1.PTNO_NUM ASC
                    ) AS [RA COUNT]

, CASE 
    WHEN DATEDIFF(DAY, c1.Dsch_Date, c2.Adm_Date) <= 30 
    THEN 1 
    ELSE 0
  END [FLAG]

FROM cte       C1
INNER JOIN cte C2
ON C1.Med_Rec_No = C2.Med_Rec_No

WHERE C1.Adm_Date <> C2.Adm_Date
AND C1.r + 1 = C2.r

ORDER BY C1.Med_Rec_No, C1.Dsch_Date

-- MANIPULATE PPR TABLE
SELECT PPR.VISIT1
, PPR.READMIT
, PPR.MRN
, PPR.INIT_DISC
, PPR.RA_ADM
--, PPR.R1
--, PPR.R2
, PPR.INTERIM1
--, PPR.RA_COUNT
, PPR.FLAG
-- THE BELOW DOES NOT WORK AT ALL
, CASE
    WHILE (SELECT PPR.INTERIM1 FROM @PPR PPR) <= 30
    BEGIN
        ROW_NUMBER() OVER (PARTITION BY PPR.MRN, PPR.VISIT1
                           ORDER BY PPR.VISIT1
                           )
        IF (SELECT PPR.INTERIM1 FROM @PPR PPR) > 30
            BREAK
    END
  END


FROM @PPR PPR

WHERE PPR.MRN = 'A NUMBER'

Sample of the current output:

INDEX | READMIT | MRN | INIT DISCHARGE | RA DATE   | INTERIM | RACOUNT | FLAG | FLAG_2
12345 | 12349   | 123 | 2005-07-05     | 2005-07-09| 4       | 1       | 1    | 0
12349 | 12351   | 123 | 2005-07-11     | 2005-07-15| 4       | 2       | 1    | 0

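So the third row is obviously not a readmission within 30 days, just a point in time at which the patient came back to the hospital. The RA_Count therefore goes back to 1 and the flag goes to 0, because it is not a 30-day readmission.

Should I create a table instead of using a cte?

What I would like to add is a chain length and a chain count. Here are some definitions:

Chain length: how many times in a row someone has been readmitted within 30 days of a subsequent visit.

For example: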

INDEX | READMIT | MRN  | INITIAL DISCHARGE    | READMIT DATE | CHAIN LEN | Count
123   | 133     | 1236 | 2009-05-13           | 2009-06-12   | 1         | 1
133   | 145     | 1236 | 2009-06-16           | 2009-07-04   | 2         | 1
145   | 157     | 1236 | 2009-07-06           | 2009-07-15   | 3         | 1
165   | 189     | 1236 | 2011-01-01           | 2011-01-12   | 1         | 2
189   | 195     | 1236 | 2011-02-06           | 2011-03-01   | 2         | 2 

Chain count is the number of chains, so the table above has 2. I was trying to use a CASE statement to build the chain length.
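As a quick check of these definitions, the five example rows can be run through DATEDIFF. This is only a sketch: the inline VALUES rows copy the example table above, and the column names InitDisc and ReadmitDate are made up for the illustration.

```sql
-- Hypothetical inline copy of the example rows (SQL Server 2008+ row constructor).
SELECT  v.[INDEX], v.READMIT, v.MRN,
        DATEDIFF(DAY, CAST(v.InitDisc AS DATETIME), CAST(v.ReadmitDate AS DATETIME)) AS INTERIM,
        CASE
            WHEN DATEDIFF(DAY, CAST(v.InitDisc AS DATETIME), CAST(v.ReadmitDate AS DATETIME)) <= 30
            THEN 1
            ELSE 0
        END AS FLAG
FROM (VALUES
        (123, 133, 1236, '20090513', '20090612'),
        (133, 145, 1236, '20090616', '20090704'),
        (145, 157, 1236, '20090706', '20090715'),
        (165, 189, 1236, '20110101', '20110112'),
        (189, 195, 1236, '20110206', '20110301')
     ) AS v([INDEX], READMIT, MRN, InitDisc, ReadmitDate)
ORDER BY v.[INDEX];
-- INTERIM is 30, 18, 9, 11 and 23 days for the five rows, so FLAG is 1 on every row.
```

Every row qualifies as a 30-day readmission, so the flag alone cannot separate the two chains. In the table the second chain begins at the fourth row, whose INDEX (165) is not the READMIT value of the row before it (157).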

Here is a SQL Fiddle with some sample data as it looks before the CTE is executed.

Thank you,

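3 answers:

Answer 0 (score: 4)

Update #1: Two events are chained if they are at most 30 days apart. The [COUNT] values are generated per person.

You can adapt the following example, which uses a recursive common table expression: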

CREATE TABLE dbo.Events (
    EventID INT IDENTITY(1,1) PRIMARY KEY,
    EventDate DATE NOT NULL,
    PersonID INT NOT NULL
);
GO
INSERT dbo.Events (EventDate, PersonID)
VALUES 
    ('2014-01-01', 1), ('2014-01-05', 1), ('2014-02-02', 1), ('2014-03-30', 1), ('2014-04-04', 1), 
    ('2014-01-11', 2), ('2014-02-02', 2),
    ('2014-01-03', 3), ('2014-03-03', 3);
GO

DECLARE @EventsWithNum TABLE (
    EventID INT NOT NULL,
    EventDate DATE NOT NULL,
    PersonID INT NOT NULL,
    EventNum INT NOT NULL,
    PRIMARY KEY (EventNum, PersonID)
);
INSERT  @EventsWithNum
SELECT  crt.EventID, crt.EventDate, crt.PersonID,
        ROW_NUMBER() OVER(PARTITION BY crt.PersonID ORDER BY crt.EventDate, crt.EventID) AS EventNum
FROM    dbo.Events crt;

WITH CountingSequentiaEvents
AS (
    SELECT  crt.EventID, crt.EventDate, crt.PersonID, crt.EventNum,
            1 AS GroupNum,
            1 AS GroupEventNum
    FROM    @EventsWithNum crt
    WHERE   crt.EventNum = 1

    UNION ALL 

    SELECT  crt.EventID, crt.EventDate, crt.PersonID, crt.EventNum,
            CASE 
                WHEN DATEDIFF(DAY, prev.EventDate, crt.EventDate) <= 30 THEN prev.GroupNum
                ELSE prev.GroupNum + 1 
            END AS GroupNum,
            CASE 
                WHEN DATEDIFF(DAY, prev.EventDate, crt.EventDate) <= 30 THEN prev.GroupEventNum + 1
                ELSE 1 
            END AS GroupEventNum
    FROM    @EventsWithNum crt JOIN CountingSequentiaEvents prev ON crt.PersonID = prev.PersonID
    AND     crt.EventNum = prev.EventNum + 1
)
SELECT  x.EventID, x.EventDate, x.PersonID,
        x.GroupEventNum AS [CHAIN LEN],
        x.GroupNum AS [Count]
FROM    CountingSequentiaEvents x
ORDER BY x.PersonID, x.EventDate
-- 1000 means 1000 + 1 = maximum 1001 events / person
OPTION (MAXRECURSION 1000); -- Please read http://msdn.microsoft.com/en-us/library/ms175972.aspx (section Guidelines for Defining and Using Recursive Common Table Expressions)

Output:

EventID EventDate  PersonID CHAIN LEN Count
------- ---------- -------- --------- -----
1       2014-01-01 1        1         1
2       2014-01-05 1        2         1
3       2014-02-02 1        3         1
------- ---------- -------- --------- -----
4       2014-03-30 1        1         2
5       2014-04-04 1        2         2
------- ---------- -------- --------- -----
6       2014-01-11 2        1         1
7       2014-02-02 2        2         1
------- ---------- -------- --------- -----
8       2014-01-03 3        1         1
------- ---------- -------- --------- -----
9       2014-03-03 3        1         2
------- ---------- -------- --------- -----

The execution plan for the last statement contains two Index Seek operators, because the PRIMARY KEY (EventNum, PersonID) constraint defined on @EventsWithNum forces SQL Server to create (in this case) a clustered index with the composite key (EventNum, PersonID).

We can also see that the estimated cost of INSERT @EventsWithNum ... is greater than the estimated cost of WITH CountingSequentiaEvents (...) SELECT ... FROM CountingSequentiaEvents ....
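For comparison, the same grouping can also be sketched without recursion. This is a different technique from the recursive CTE above: mark every event that starts a chain, then count the starts up to each event. It avoids LAG and windowed running sums, which SQL Server 2008 R2 does not offer. The names @Events, Numbered, Starts, LastStart and ChainNo are assumptions made for this sketch, and the table variable simply repeats the sample rows so that the batch runs on its own.

```sql
-- Hypothetical stand-in for dbo.Events with the same nine sample rows.
DECLARE @Events TABLE (
    EventID   INT  NOT NULL PRIMARY KEY,
    EventDate DATE NOT NULL,
    PersonID  INT  NOT NULL
);
INSERT @Events (EventID, EventDate, PersonID)
VALUES
    (1, '2014-01-01', 1), (2, '2014-01-05', 1), (3, '2014-02-02', 1),
    (4, '2014-03-30', 1), (5, '2014-04-04', 1),
    (6, '2014-01-11', 2), (7, '2014-02-02', 2),
    (8, '2014-01-03', 3), (9, '2014-03-03', 3);

WITH Numbered AS (
    -- Number each person's events in date order.
    SELECT  e.EventID, e.EventDate, e.PersonID,
            ROW_NUMBER() OVER (PARTITION BY e.PersonID
                               ORDER BY e.EventDate, e.EventID) AS EventNum
    FROM    @Events e
),
Starts AS (
    -- An event starts a chain when it has no predecessor,
    -- or when the predecessor is more than 30 days earlier.
    SELECT  crt.PersonID, crt.EventNum
    FROM    Numbered crt
    LEFT JOIN Numbered prev
            ON  prev.PersonID = crt.PersonID
            AND prev.EventNum = crt.EventNum - 1
    WHERE   prev.EventNum IS NULL
       OR   DATEDIFF(DAY, prev.EventDate, crt.EventDate) > 30
)
SELECT  n.EventID, n.EventDate, n.PersonID,
        n.EventNum - s.LastStart + 1 AS [CHAIN LEN], -- position inside the current chain
        s.ChainNo                    AS [Count]      -- number of chains started so far
FROM    Numbered n
CROSS APPLY (
    SELECT  MAX(st.EventNum) AS LastStart,
            COUNT(*)         AS ChainNo
    FROM    Starts st
    WHERE   st.PersonID = n.PersonID
      AND   st.EventNum <= n.EventNum
) s
ORDER BY n.PersonID, n.EventDate;
```

On this sample data the sketch should return the same [CHAIN LEN] and [Count] values as the recursive version, and it has no MAXRECURSION limit. The Numbered CTE is referenced more than once here, so on a large table it may be worth storing the numbered rows first, as @EventsWithNum does above.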

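Answer 1 (score: 2)

The solution is to start with a view that summarizes the data for each VisitID. I used a view rather than a CTE because this looks like something you will use more than once.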

    create view vReadmits
    as
    select t.VisitID,
    t.UID,
    min(r.AdmitDT) ReadmittedDT, 
    min(r.VisitID) NextVisitID,
    sum(case when r.AdmitDT < dateadd(d, 30, isnull(t.DischargeDT, t.AdmitDT)) 
        then 1 else 0 end) ReadmitNext30
    from t 
    left join t as r
    on t.UID = r.UID
    and t.VisitID < r.VisitID
    group by t.VisitID,
    t.UID

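For each VisitID, this finds the next VisitID for the same UID. It also counts the future visits that fall within the following 30 days. ISNULL() covers the case where DischargeDT is missing.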
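To make the view's logic concrete, here is the same SELECT run against a small table variable. This is a sketch under assumptions: the column types and the three sample visits are invented, and a table variable is used only because a view cannot reference one, so the query body is repeated instead of the view being created.

```sql
-- Hypothetical sample data; column names follow the view, types are assumed.
DECLARE @t TABLE (
    VisitID     INT      NOT NULL PRIMARY KEY,
    UID         INT      NOT NULL,
    AdmitDT     DATETIME NOT NULL,
    DischargeDT DATETIME NULL
);
INSERT @t (VisitID, UID, AdmitDT, DischargeDT)
VALUES
    (1, 10, '20140101', '20140103'),
    (2, 10, '20140120', NULL),        -- missing discharge date
    (3, 10, '20140401', '20140402');

SELECT  t.VisitID,
        t.UID,
        MIN(r.AdmitDT) AS ReadmittedDT,   -- earliest later admission
        MIN(r.VisitID) AS NextVisitID,    -- next visit for the same UID
        SUM(CASE WHEN r.AdmitDT < DATEADD(d, 30, ISNULL(t.DischargeDT, t.AdmitDT))
                 THEN 1 ELSE 0 END) AS ReadmitNext30
FROM    @t t
LEFT JOIN @t r
        ON  t.UID = r.UID
        AND t.VisitID < r.VisitID
GROUP BY t.VisitID, t.UID
ORDER BY t.VisitID;
```

With these rows, visit 1 gets NextVisitID 2 and ReadmitNext30 = 1, because only the 2014-01-20 admission falls before 2014-02-02. Visit 2 falls back to its AdmitDT and gets ReadmitNext30 = 0, and visit 3 has no later visit, so its ReadmittedDT and NextVisitID are NULL.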

You can then put the chain logic in a CTE and join the CTE to the view, so that the view's columns are included in the result.

with Chains as
(
select v.UID,
sum(case when r.ReadmittedDT < dateadd(d, 30, v.ReadmittedDT)
then 0 else 1 end) as ChainCount    
from vReadmits v
left join vReadmits r
on r.NextVisitID = v.VisitID
group by v.UID
)

select t.UID, 
t.VisitId, 
t.AdmitDT, 
t.DischargeDT, 
v.NextVisitID, 
v.ReadmitNext30, 
v.ReadmittedDT,
c.ChainCount
from t 
join vReadmits v
on t.VisitID = v.VisitID
inner join Chains c
on v.UID = c.UID
order by t.UID, t.VisitID

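Here is the SQLFiddle.

The assumption I made is that if one VisitID is greater than another, its AdmitDT is greater as well. That should be the case (especially for the same UID), but if it is not, you would change the view to compare AdmitDTs instead of VisitIDs.

Answer 2 (score: 1)

There is one drawback of CTEs to keep in mind.

SQL Server does not store a CTE's result; every reference to the CTE runs its query again.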

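Your query references the CTE twice (c1 and c2), which means the underlying query is executed twice. It is better to store the CTE results in a table variable or a temp table and then do the join.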
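A minimal sketch of that suggestion, reusing the view and columns from the question. The temp table name #Visits and the index name IX_Visits are assumptions made for the example, and the index is optional.

```sql
-- Store the numbered visits once instead of defining them in a CTE.
SELECT  PtNo_Num,
        Med_Rec_No,
        Dsch_Date,
        Adm_Date,
        ROW_NUMBER() OVER (PARTITION BY Med_Rec_No
                           ORDER BY PtNo_Num) AS r
INTO    #Visits                         -- hypothetical temp table name
FROM    smsdss.BMH_PLM_PtAcct_V
WHERE   Plm_Pt_Acct_Type = 'I'
  AND   PtNo_Num < '20000000';

-- Optional: lets the self-join seek on (Med_Rec_No, r).
CREATE UNIQUE CLUSTERED INDEX IX_Visits ON #Visits (Med_Rec_No, r);

-- Join the stored rows to themselves: each visit with the same patient's next visit.
SELECT  c1.PtNo_Num   AS [INDEX],
        c2.PtNo_Num   AS [READMIT],
        c1.Med_Rec_No AS [MRN],
        c1.Dsch_Date  AS [INITIAL DISCHARGE],
        c2.Adm_Date   AS [READMIT DATE],
        DATEDIFF(DAY, c1.Dsch_Date, c2.Adm_Date) AS INTERIM1
FROM    #Visits c1
INNER JOIN #Visits c2
        ON  c1.Med_Rec_No = c2.Med_Rec_No
        AND c1.r + 1 = c2.r
WHERE   c1.Adm_Date <> c2.Adm_Date
ORDER BY c1.Med_Rec_No, c1.Dsch_Date;

DROP TABLE #Visits;
```

The source view is read once here, and both sides of the join work from the stored rows. Whether this is actually faster depends on the data volume and should be checked against the execution plans.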