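I'm trying to find everyone who has missed 3 or more consecutive appointments. I believe I can use window functions to do this, but I'm stuck and looking for some help.
Here is a sample of what I'm looking for. In the data below, PATID = x001 meets the criteria, having missed 1/12, 1/14 and 1/15, while PATID = x002 does not.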
PATID DEPT DATE STATUS
x001 A002 1/1/2016 Missed
x001 A002 1/5/2016 Complete
x001 A002 1/8/2016 Missed
x001 A002 1/10/2016 Complete
x001 A002 1/12/2016 Missed
x001 A002 1/14/2016 Missed
x001 A002 1/15/2016 Missed
x001 A002 1/19/2016 Complete
x002 A003 1/1/2016 Missed
x002 A003 1/5/2016 Complete
x002 A003 1/8/2016 Missed
x002 A003 1/10/2016 Complete
x002 A003 1/12/2016 Missed
x002 A003 1/14/2016 Complete
x002 A003 1/15/2016 Missed
x002 A003 1/19/2016 Complete
Here is what I have so far.
SELECT
PR.PATID
, PR.DEPT
, PR.DATE
, CASE WHEN PR.STATUS IN (3,4) THEN 'Cancel' WHEN PR.STATUS = 2 THEN 'COMPLETED' ELSE 'ERROR' END AS STATUS
, ROW_NUMBER() OVER (PARTITION BY PR.PAT_ID, PR.DEPARTMENT_ID ORDER BY PR.PAT_ID, PR.DEPARTMENT_ID, PR.CONTACT_DATE) AS RN -- Just numbers the rows
, COUNT(*) OVER (PARTITION BY PR.PAT_ID,PR.DEPARTMENT_ID, CASE WHEN PR.APPT_STATUS_C IN (3,4) AND PR.CANCEL_REASON_C <> 4 THEN 'Cancel' WHEN PR.APPT_STATUS_C = 2 THEN 'COMPLETED' ELSE 'ERROR' END) AS RNC -- Should have break at new statuses
FROM #PatsReturn AS PR
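Once I figure out how to get the correct breaks, by date, at each status change, I'll need a way to identify (maybe with a new flag field?) the PATIDs that have missed 3+ consecutive appointments.
Any help is greatly appreciated.
Thanks,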
Answer 0: (score: 2)
One approach is simply to use lag()/lead():
select distinct patid
from (select pr.*,
lead(status) over (partition by patid order by date) as status_1,
lead(status, 2) over (partition by patid order by date) as status_2
from #PatsReturn pr
) pr
where status = 'Missed' and status_1 = 'Missed' and status_2 = 'Missed';
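The query above only shows lead(). As a sketch of the lag() form mentioned in the text, the same check can look backwards instead, so the match lands on the last appointment of a three-miss run rather than the first. The column aliases prev_status_1 and prev_status_2 are illustrative names, and the table is assumed to have the patid, [date] and [status] columns used above.

```sql
-- Sketch only: assumes #PatsReturn(patid, [date], [status]) as in the query above.
-- prev_status_1 / prev_status_2 are hypothetical alias names.
select distinct patid
from (select pr.*,
             -- status of the previous appointment for the same patient
             lag([status])    over (partition by patid order by [date]) as prev_status_1,
             -- status of the appointment two rows back
             lag([status], 2) over (partition by patid order by [date]) as prev_status_2
      from #PatsReturn pr
     ) pr
-- a row qualifies when it and the two appointments before it were all missed
where [status] = 'Missed' and prev_status_1 = 'Missed' and prev_status_2 = 'Missed';
```

Both forms return the same set of patients; lag() and lead() require SQL Server 2012 or later.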
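If you need information about each sequence, you can identify groups of similar appointments. One approach is the difference of row numbers: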
select patid, count(*) as numMissed, min(date), max(date)
from (select pr.*,
row_number() over (partition by patid order by date) as seqnum_p,
row_number() over (partition by patid, status order by date) as seqnum_ps
from #PatsReturn pr
) pr
where status = 'Missed'
group by patid, (seqnum_p - seqnum_ps)
having count(*) >= 3;
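To see how this works, run the subquery: the difference between the two "seqnum" values is constant within each run of consecutive rows that share the same status.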
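As a rough illustration of that subquery, the script below loads the sample rows from the question into a scratch table and prints both row numbers and their difference. The table name #PatsReturnDemo and the grp column are hypothetical names chosen for this sketch, and the column types are an assumption.

```sql
-- Hypothetical scratch table; column names follow the sample data in the question.
CREATE TABLE #PatsReturnDemo (
    PATID    char(4),
    DEPT     char(4),
    [date]   date,
    [status] varchar(10)
);

INSERT INTO #PatsReturnDemo (PATID, DEPT, [date], [status]) VALUES
('x001', 'A002', '20160101', 'Missed'),
('x001', 'A002', '20160105', 'Complete'),
('x001', 'A002', '20160108', 'Missed'),
('x001', 'A002', '20160110', 'Complete'),
('x001', 'A002', '20160112', 'Missed'),
('x001', 'A002', '20160114', 'Missed'),
('x001', 'A002', '20160115', 'Missed'),
('x001', 'A002', '20160119', 'Complete'),
('x002', 'A003', '20160101', 'Missed'),
('x002', 'A003', '20160105', 'Complete'),
('x002', 'A003', '20160108', 'Missed'),
('x002', 'A003', '20160110', 'Complete'),
('x002', 'A003', '20160112', 'Missed'),
('x002', 'A003', '20160114', 'Complete'),
('x002', 'A003', '20160115', 'Missed'),
('x002', 'A003', '20160119', 'Complete');

-- seqnum_p numbers every appointment per patient; seqnum_ps numbers them per
-- patient and status. Their difference (grp) stays fixed while the status repeats.
SELECT pr.PATID, pr.[date], pr.[status],
       ROW_NUMBER() OVER (PARTITION BY pr.PATID ORDER BY pr.[date]) AS seqnum_p,
       ROW_NUMBER() OVER (PARTITION BY pr.PATID, pr.[status] ORDER BY pr.[date]) AS seqnum_ps,
       ROW_NUMBER() OVER (PARTITION BY pr.PATID ORDER BY pr.[date])
     - ROW_NUMBER() OVER (PARTITION BY pr.PATID, pr.[status] ORDER BY pr.[date]) AS grp
FROM #PatsReturnDemo AS pr
ORDER BY pr.PATID, pr.[date];

-- Rows for x001:
--   date        status    seqnum_p  seqnum_ps  grp
--   2016-01-01  Missed    1         1          0
--   2016-01-05  Complete  2         1          1
--   2016-01-08  Missed    3         2          1
--   2016-01-10  Complete  4         2          2
--   2016-01-12  Missed    5         3          2
--   2016-01-14  Missed    6         4          2
--   2016-01-15  Missed    7         5          2
--   2016-01-19  Complete  8         3          5

DROP TABLE #PatsReturnDemo;
```

Among the Missed rows for x001, grp takes the values 0, 1 and 2, and only grp = 2 has three rows (1/12, 1/14, 1/15), which is the group the having count(*) >= 3 filter keeps. For x002 every Missed row gets a different grp, so nothing is returned for that patient.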
Answer 1: (score: 0)
If you go with Gordon's first solution, you'll want a POC (partition, order, covering) index. Say this is your table:
CREATE TABLE #PatsReturn (PATID char(4), DEPT char(4), [date] date, [status] varchar(10));
You would want this index:
CREATE UNIQUE INDEX nc_PatsReturn ON #PatsReturn(PATID, [date]) INCLUDE ([status]);
This gives you a nice linear query execution plan with no sort. Indexing for the second solution is a bit trickier.