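Determine whether 4 or more visits occurred within 45 days

Time: 2015-03-27 21:22:35

Tags: sql loops sas

Below is some sample data. I need to keep a patient only if they had 4 or more visits within 45 days. I have transposed the dataset and used arrays to work out an approach, but I am hoping there is a more efficient way.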

    Pat_ID  Date        Prov_ID
    A       05/12/2012  X1
    A       05/12/2012  X2
    B       11/12/2012  X1
    B       11/20/2012  X1
    B       01/12/2013  X1
    B       03/22/2013  X1
    C       04/25/2013  X1
    C       04/25/2013  X2
    C       04/27/2013  X1
    C       05/12/2013  X1
    C       05/22/2013  X2
    C       04/25/2012  X3
    ...

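I started by deleting observations for patients with fewer than 4 events.

Any ideas would be appreciated.

The end result should be a dataset containing only the PAT_IDs that have 4 or more visits within 45 days.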

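4 Answers:

Answer 0: (score: 5)

Here is a SAS-based (rather than SQL) solution using the lag function. It reads the data only once, so it should be quite efficient, especially compared with self-join-style solutions.

First, sort the data by ID and visit date (if it is not sorted already):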

proc sort data=YourData; 
    by Pat_ID Date;
run;

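If you look back 3 records at the Pat_ID and date, you can test whether that earlier record is within 45 days of the current one and belongs to the same patient. If so, add the patient to the list.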

data list_of_membs(keep=Pat_ID);
    set YourData;
    by Pat_ID;
    retain added 0; *1 once the current Pat_ID has been written to the list;

    if first.Pat_ID then added = 0;

    pat_id_3back = lag3(Pat_ID); *Pat_ID from 3 records back;
    date_3back = lag3(Date);     *visit date from 3 records back;

    *Use <= 45 instead when a visit exactly 45 days later should count;
    if Pat_ID = pat_id_3back
        and (Date - date_3back) < 45
        and not added then do;
            output;
            added = 1;
    end;
run;
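
The look-back test can also be sketched outside SAS. The following is a minimal Python illustration of the same idea, not a translation of the DATA step: the function name `patients_with_cluster` and its parameters are hypothetical, and the sample rows are the ones from the question with Prov_ID dropped.

```python
from datetime import date

def patients_with_cluster(visits, n_visits=4, window_days=45):
    """Return patient IDs that have n_visits visits whose first and last
    dates are fewer than window_days apart (the same test as lag3)."""
    by_patient = {}
    for pat_id, visit_date in visits:
        by_patient.setdefault(pat_id, []).append(visit_date)

    flagged = []
    back = n_visits - 1  # 3 records back when looking for 4 visits
    for pat_id, dates in by_patient.items():
        dates.sort()
        for i in range(back, len(dates)):
            # Compare the current visit with the one `back` positions earlier.
            if (dates[i] - dates[i - back]).days < window_days:
                flagged.append(pat_id)
                break  # each patient is listed at most once
    return flagged

sample_visits = [
    ("A", date(2012, 5, 12)), ("A", date(2012, 5, 12)),
    ("B", date(2012, 11, 12)), ("B", date(2012, 11, 20)),
    ("B", date(2013, 1, 12)), ("B", date(2013, 3, 22)),
    ("C", date(2013, 4, 25)), ("C", date(2013, 4, 25)),
    ("C", date(2013, 4, 27)), ("C", date(2013, 5, 12)),
    ("C", date(2013, 5, 22)), ("C", date(2012, 4, 25)),
]

print(patients_with_cluster(sample_visits))  # ['C']
```

Because the dates are sorted within each patient, checking only the visit three positions back is enough: if that one is inside the window, so are the two in between.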

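Answer 1: (score: 0)

I think you should be able to achieve this by self-joining the result and counting, for each row, how many of the same patient's visits fall within 45 days after that row's date.

The following code is untested, but hopefully it points you in the right direction.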

WITH X AS (
--your original query here
)

SELECT
X.*,
COUNT(Y.Date) over (PARTITION BY X.Pat_ID, X.VisitID) as [# of visits within 45 days after this one]
INTO #Y
FROM X
INNER JOIN X as Y on X.Pat_ID = Y.Pat_ID and Y.Date > X.Date and Y.Date < DATEADD(d,45,X.Date)

SELECT DISTINCT
*
FROM #Y
WHERE [# of visits within 45 days after this one] >= 3

DROP TABLE #Y
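
As a rough illustration of the counting idea, here is a Python sketch of a nested-loop self-join. It is not the query above: it uses each row's list position as a stand-in for VisitID (an assumption), which lets it count same-day visits that the strict `Y.Date > X.Date` comparison would skip. The function names are hypothetical.

```python
from datetime import date, timedelta

def later_visit_counts(visits, window_days=45):
    """For each visit, count the same patient's other visits dated on or
    after it and fewer than window_days later (nested-loop self-join)."""
    counts = []
    for i, (pat_x, date_x) in enumerate(visits):
        limit = date_x + timedelta(days=window_days)
        n = 0
        for j, (pat_y, date_y) in enumerate(visits):
            # i != j keeps a visit from matching itself.
            if i != j and pat_x == pat_y and date_x <= date_y < limit:
                n += 1
        counts.append(n)
    return counts

def qualifying_patients(visits, min_followups=3, window_days=45):
    """Patients with at least one visit followed by min_followups others."""
    counts = later_visit_counts(visits, window_days)
    return sorted({pat for (pat, _), n in zip(visits, counts)
                   if n >= min_followups})

sample_visits = [
    ("A", date(2012, 5, 12)), ("A", date(2012, 5, 12)),
    ("B", date(2012, 11, 12)), ("B", date(2012, 11, 20)),
    ("B", date(2013, 1, 12)), ("B", date(2013, 3, 22)),
    ("C", date(2013, 4, 25)), ("C", date(2013, 4, 25)),
    ("C", date(2013, 4, 27)), ("C", date(2013, 5, 12)),
    ("C", date(2013, 5, 22)), ("C", date(2012, 4, 25)),
]

print(qualifying_patients(sample_visits))  # ['C']
```

This compares every pair of rows, so it does more work than a single sorted pass, which is the usual trade-off of a self-join.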

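Answer 2: (score: 0)

EDIT: When I first answered this question, I assumed the result should be the records from the original table that lie within 45 days of one another. I have now included a SQL Server implementation of Tim Sand's answer (the one without a self-join), in case anyone needs a SQL Server version in the future. It is worth mentioning that if you need any data from the table other than pat_id, you will have to join back to the table to get it.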

DECLARE @PatientVisits TABLE
(
    VisitID INT PRIMARY KEY IDENTITY
    ,Pat_ID  VARCHAR(5)
    ,Date   DATETIME2
    ,Prov_ID VARCHAR(5)
)

INSERT INTO @PatientVisits
VALUES

    ('A'       ,'05/12/2012'  ,'X1')
   ,('A'       ,'05/12/2012'  ,'X2')
   ,('B'       ,'11/12/2012'  ,'X1')
   ,('B'       ,'11/20/2012'  ,'X1')
   ,('B'       ,'01/12/2013'  ,'X1')
   ,('B'       ,'03/22/2013'  ,'X1')
   ,('C'       ,'04/25/2013'  ,'X1')
   ,('C'       ,'04/25/2013'  ,'X2')
   ,('C'       ,'04/27/2013'  ,'X1')
   ,('C'       ,'05/12/2013'  ,'X1')
   ,('C'       ,'05/22/2013'  ,'X2')
   ,('C'       ,'04/25/2012'  ,'X3');

SELECT DISTINCT
    Pat_ID
FROM
(
    SELECT
        Pat_ID
        ,Date AS CurrentDate
        ,LAG (Date,3) OVER ( PARTITION BY Pat_ID ORDER BY Date ASC ) AS DateThreeVisitsAgo -- NULL when fewer than 3 earlier visits
    FROM @PatientVisits
) Visits
WHERE 
    DATEDIFF(DAY,DateThreeVisitsAgo,CurrentDate) <= 45 
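
The point about joining back to the table can be sketched in Python as a simple filter on the original rows. This is only an illustration: `rows_for_patients` is a hypothetical helper, and the set of qualifying IDs is assumed to come from a query like the one above.

```python
def rows_for_patients(rows, patient_ids):
    """Keep the full rows whose Pat_ID is in patient_ids (a semi-join)."""
    wanted = set(patient_ids)
    return [row for row in rows if row[0] in wanted]

# (Pat_ID, Date, Prov_ID) rows from the question
sample_rows = [
    ("A", "05/12/2012", "X1"), ("A", "05/12/2012", "X2"),
    ("B", "11/12/2012", "X1"), ("B", "11/20/2012", "X1"),
    ("B", "01/12/2013", "X1"), ("B", "03/22/2013", "X1"),
    ("C", "04/25/2013", "X1"), ("C", "04/25/2013", "X2"),
    ("C", "04/27/2013", "X1"), ("C", "05/12/2013", "X1"),
    ("C", "05/22/2013", "X2"), ("C", "04/25/2012", "X3"),
]

for row in rows_for_patients(sample_rows, {"C"}):
    print(row)
```

Note that this returns every visit of a qualifying patient, including the 04/25/2012 visit that lies outside the 45-day cluster.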

Here is my original answer:

DECLARE @PatientVisits TABLE
(
    VisitID INT PRIMARY KEY IDENTITY
    ,Pat_ID  VARCHAR(5)
    ,Date   DATETIME2
    ,Prov_ID VARCHAR(5)
)

INSERT INTO @PatientVisits
VALUES

    ('A'       ,'05/12/2012'  ,'X1')
   ,('A'       ,'05/12/2012'  ,'X2')
   ,('B'       ,'11/12/2012'  ,'X1')
   ,('B'       ,'11/20/2012'  ,'X1')
   ,('B'       ,'01/12/2013'  ,'X1')
   ,('B'       ,'03/22/2013'  ,'X1')
   ,('C'       ,'04/25/2013'  ,'X1')
   ,('C'       ,'04/25/2013'  ,'X2')
   ,('C'       ,'04/27/2013'  ,'X1')
   ,('C'       ,'05/12/2013'  ,'X1')
   ,('C'       ,'05/22/2013'  ,'X2')
   ,('C'       ,'04/25/2012'  ,'X3');

SELECT 
    VisitID
    ,Pat_ID
    ,Date
    ,Prov_ID
FROM
(
    SELECT
        P.VisitID
        ,P.Pat_ID
        ,P.Date
        ,P.Prov_ID
        ,COUNT(*) OVER (PARTITION BY P.VisitID) AS NearVisitCount
    FROM @PatientVisits P
    JOIN @PatientVisits P2
        ON P.Pat_ID = P2.Pat_ID
        AND P.Date BETWEEN DATEADD(DAY,-45,P2.Date) AND DATEADD(DAY,45,P2.Date)

) Visits
WHERE
    NearVisitCount >= 4 
GROUP BY 
    VisitID
    ,Pat_ID
    ,Date
    ,Prov_ID

Results:

VisitID Pat_ID  Date        Prov_ID
7       C       2013-04-25  X1
8       C       2013-04-25  X2
9       C       2013-04-27  X1
10      C       2013-05-12  X1
11      C       2013-05-22  X2

Answer 3: (score: 0)

This also works:

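    proc sql;
     create table want as
     select distinct v.Pat_ID from
      (
       /* For each visit a, count the same patient's visits b that fall
          in the 45 days up to and including a. Assumes that
          (Pat_ID, Date, Prov_ID) identifies a single visit. */
       select a.Pat_ID, a.Date, a.Prov_ID, count(*) as n_visits
       from YourData as a
       join YourData as b
       on
        (
         a.Pat_ID = b.Pat_ID
         and intck('day', b.Date, a.Date) between 0 and 45
        )
       group by a.Pat_ID, a.Date, a.Prov_ID
      ) as v
     where v.n_visits >= 4;
    quit;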