Question

我们有如下所示的约会表。每个约会都需要分类为“新”或“后续”。在首次预约（该病人）后30天内（针对该病人）的任何预约都是随访。 30天后，约会再次为“新”。 30天之内的任何约会都将成为“后续活动”。

我目前正在通过键入while循环来实现此功能。
如何在没有WHILE循环的情况下实现这一目标？

表

CREATE TABLE #Appt1 (ApptID INT, PatientID INT, ApptDate DATE)
INSERT INTO #Appt1
SELECT  1,101,'2020-01-05' UNION
SELECT  2,505,'2020-01-06' UNION
SELECT  3,505,'2020-01-10' UNION
SELECT  4,505,'2020-01-20' UNION
SELECT  5,101,'2020-01-25' UNION
SELECT  6,101,'2020-02-12'  UNION
SELECT  7,101,'2020-02-20'  UNION
SELECT  8,101,'2020-03-30'  UNION
SELECT  9,303,'2020-01-28' UNION
SELECT  10,303,'2020-02-02'

Answer 1

您需要使用递归查询。

从上一个开始计算30天的期限（没有递归/古怪的更新/循环，是不可能的）。这就是为什么仅使用ROW_NUMBER的所有现有答案都失败的原因。

WITH f AS (
  SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY PatientId ORDER BY ApptDate) 
  FROM Appt1
), rec AS (
  SELECT Category = CAST('New' AS NVARCHAR(20)), ApptId, PatientId, ApptDate, rn, startDate = ApptDate
  FROM f
  WHERE rn = 1
  UNION ALL
  SELECT CAST(CASE WHEN DATEDIFF(DAY,  rec.startDate,f.ApptDate) <= 30 THEN N'FollowUp' ELSE N'New' END AS NVARCHAR(20)), 
         f.ApptId,f.PatientId,f.ApptDate, f.rn,
         CASE WHEN DATEDIFF(DAY, rec.startDate, f.ApptDate) <= 30 THEN rec.startDate ELSE f.ApptDate END
  FROM rec
  JOIN f
    ON rec.rn = f.rn - 1
   AND rec.PatientId = f.PatientId
)
SELECT ApptId, PatientId, ApptDate, Category
FROM rec
ORDER BY PatientId, ApptDate;

db<>fiddle demo

输出：

+---------+------------+-------------+----------+
| ApptId  | PatientId  |  ApptDate   | Category |
+---------+------------+-------------+----------+
|      1  |       101  | 2020-01-05  | New      |
|      5  |       101  | 2020-01-25  | FollowUp |
|      6  |       101  | 2020-02-12  | New      |
|      7  |       101  | 2020-02-20  | FollowUp |
|      8  |       101  | 2020-03-30  | New      |
|      9  |       303  | 2020-01-28  | New      |
|     10  |       303  | 2020-02-02  | FollowUp |
|      2  |       505  | 2020-01-06  | New      |
|      3  |       505  | 2020-01-10  | FollowUp |
|      4  |       505  | 2020-01-20  | FollowUp |
+---------+------------+-------------+----------+

工作原理：

f-获取起点（锚点-每个PatientId）
rec-递归部分，如果当前值与prev之差> 30，则在PatientId上下文中更改类别和起点。
主要-显示排序的结果集

类似的课程：

Conditional SUM on Oracle-封闭窗口函数

Session window (Azure Stream Analytics)

Running Total until specific condition is true-古怪的更新

附录

请勿在生产环境中使用此代码！

但是，除了使用cte之外，还有一个值得一提的选择是使用临时表并在“回合”中更新

可以在“单”轮中完成（更新）：

CREATE TABLE Appt_temp (ApptID INT , PatientID INT, ApptDate DATE, Category NVARCHAR(10))

INSERT INTO Appt_temp(ApptId, PatientId, ApptDate)
SELECT ApptId, PatientId, ApptDate
FROM Appt1;

CREATE CLUSTERED INDEX Idx_appt ON Appt_temp(PatientID, ApptDate);

查询：

DECLARE @PatientId INT = 0,
        @PrevPatientId INT,
        @FirstApptDate DATE = NULL;

UPDATE Appt_temp
SET  @PrevPatientId = @PatientId
    ,@PatientId     = PatientID 
    ,@FirstApptDate = CASE WHEN @PrevPatientId <> @PatientId THEN ApptDate
                           WHEN DATEDIFF(DAY, @FirstApptDate, ApptDate)>30 THEN ApptDate
                           ELSE @FirstApptDate
                      END
    ,Category       = CASE WHEN @PrevPatientId <> @PatientId THEN 'New'
                           WHEN @FirstApptDate = ApptDate THEN 'New'
                           ELSE 'FollowUp' 
                      END
FROM Appt_temp WITH(INDEX(Idx_appt))
OPTION (MAXDOP 1);

SELECT * FROM  Appt_temp ORDER BY PatientId, ApptDate;

db<>fiddle Quirky update

Answer 2

您可以使用递归cte进行此操作。您应该首先在每个患者内按apptDate订购。这可以通过常规CTE来完成。

然后，在递归cte的锚点中，为每个患者选择第一个顺序，将状态标记为“新”，还将apptDate标记为最新“新”记录的日期。

在递归cte的递归部分中，递增到下一个约会，计算当前约会与最近的“新”约会日期之间的天数差。如果超过30天，则将其标记为“新”并重置最近的新约会日期。否则，将其标记为“跟进”，并沿用新约会日期以来的现有日期。

最后，在基本查询中，只需选择所需的列即可。

with orderings as (

    select       *, 
                 rn = row_number() over(
                     partition by patientId 
                     order by apptDate
                 ) 
    from         #appt1 a

),

markings as (

    select       apptId, 
                 patientId, 
                 apptDate, 
                 rn, 
                 type = convert(varchar(10),'new'),
                 dateOfNew = apptDate
    from         orderings 
    where        rn = 1

    union all
    select       o.apptId, o.patientId, o.apptDate, o.rn,
                 type = convert(varchar(10),iif(ap.daysSinceNew > 30, 'new', 'follow up')),
                 dateOfNew = iif(ap.daysSinceNew > 30, o.apptDate, m.dateOfNew)
    from         markings m
    join         orderings o 
                     on m.patientId = o.patientId 
                     and m.rn + 1 = o.rn
    cross apply  (select daysSinceNew = datediff(day, m.dateOfNew, o.apptDate)) ap

)

select    apptId, patientId, apptDate, type
from      markings
order by  patientId, rn;

我应该提一下，我最初删除了此答案，因为Abhijeet Khandagale的答案似乎可以通过更简单的查询满足您的需求（稍作修改后）。但是，在您向他发表有关您的业务需求和添加的示例数据的评论时，我取消删除了我的报告，因为相信这可以满足您的需求。

Answer 3

我不确定这正是您实现的。但是，除了使用cte之外，还有一个值得一提的选择是使用临时表并在“回合”中进行更新。因此，我们将在所有状态未正确设置的情况下更新临时表，并以迭代方式生成结果。我们可以只使用局部变量来控制迭代次数。

因此，我们将每个迭代分为两个阶段。

设置“新记录”附近的所有“跟进”值。仅使用正确的过滤器就很容易做到。
对于其余未设置状态的记录，我们可以选择具有相同PatientID的组中的第一个。并说它们是新的，因为它们没有在第一阶段进行处理。

所以

CREATE TABLE #Appt2 (ApptID INT, PatientID INT, ApptDate DATE, AppStatus nvarchar(100))

select * from #Appt1
insert into #Appt2 (ApptID, PatientID, ApptDate, AppStatus)
select a1.ApptID, a1.PatientID, a1.ApptDate, null from #Appt1 a1
declare @limit int = 0;

while (exists(select * from #Appt2 where AppStatus IS NULL) and @limit < 1000)
begin
  set @limit = @limit+1;
  update a2
  set
    a2.AppStatus = IIF(exists(
        select * 
        from #Appt2 a 
        where 
          0 > DATEDIFF(day, a2.ApptDate, a.ApptDate) 
          and DATEDIFF(day, a2.ApptDate, a.ApptDate) > -30 
          and a.ApptID != a2.ApptID 
          and a.PatientID = a2.PatientID
          and a.AppStatus = 'New'
          ), 'Followup', a2.AppStatus)
  from #Appt2 a2

  --select * from #Appt2

  update a2
  set a2.AppStatus = 'New'
  from #Appt2 a2 join (select a.*, ROW_NUMBER() over (Partition By PatientId order by ApptId) rn from (select * from #Appt2 where AppStatus IS NULL) a) ar
  on a2.ApptID = ar.ApptID
  and ar.rn = 1

  --select * from #Appt2

end

select * from #Appt2 order by PatientID, ApptDate

drop table #Appt1
drop table #Appt2

更新。阅读Lukasz提供的评论。到目前为止，这是更聪明的方法。我把我的答案只是一个想法。

Answer 4

我相信递归通用表达式是优化查询以避免循环的好方法，但是在某些情况下，它会导致性能下降，因此应尽可能避免。

我使用下面的代码来解决该问题，并测试它会获得更多的价值，但同时也鼓励您使用实际数据对其进行测试。

WITH DataSource AS
(
    SELECT *
          ,CEILING(DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate]) * 1.0 / 30 + 0.000001) AS [GroupID]
    FROM #Appt1
)
SELECT *
     ,IIF(ROW_NUMBER() OVER (PARTITION BY [PatientID], [GroupID] ORDER BY [ApptDate]) = 1, 'New', 'Followup')
FROM DataSource
ORDER BY [PatientID]
        ,[ApptDate];

这个想法很简单-我想将记录分组（30天），其中最小的记录是new，其他记录是follow ups。检查语句的构建方式：

SELECT *
      ,DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate])
      ,DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate]) * 1.0 / 30
      ,CEILING(DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate]) * 1.0 / 30 + 0.000001) 
FROM #Appt1
ORDER BY [PatientID]
        ,[ApptDate];

所以：

首先，我们获取每个组的第一个日期，并计算与当前日期的天数差异
然后，我们要获取组-已添加* 1.0 / 30
至30、60、90等天，我们正在获取整数，我们想开始一个新的时期，我添加了+ 0.000001；另外，我们正在使用吊顶函数来获取smallest integer greater than, or equal to, the specified numeric expression

就是这样。有了这样的组，我们只需使用ROW_NUMBER来找到我们的开始日期，并将其设置为new，其余的保留为follow ups。

Answer 5

尽管问题中并未明确解决，但很容易发现约会日期不能简单地按30天小组进行分类。这没有商业意义。而且您也不能使用appt ID。今天可以为2020-09-06进行新的约会。这是我解决此问题的方法。首先，获取第一个约会，然后计算每个约会和第一个约会之间的日期差。如果为0，则设置为“新建”。如果<= 30'跟进'。如果> 30，则设置为“未定”并进行下一轮检查，直到不再有“未定”为止。为此，您确实需要一个while循环，但是它不会循环每个约会日期，而只是循环几个数据集。我检查了执行计划。即使只有10行，查询成本也明显低于使用递归CTE的查询成本，但不及Lukasz Szozda的附录方法低。

IF OBJECT_ID('tempdb..#TEMPTABLE') IS NOT NULL DROP TABLE #TEMPTABLE
SELECT ApptID, PatientID, ApptDate
    ,CASE WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) = 0) THEN 'New' 
    WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) <= 30) THEN 'Followup'
    ELSE 'Undecided' END AS Category
INTO #TEMPTABLE
FROM #Appt1

WHILE EXISTS(SELECT TOP 1 * FROM #TEMPTABLE WHERE Category = 'Undecided') BEGIN
    ;WITH CTE AS (
        SELECT ApptID, PatientID, ApptDate 
            ,CASE WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) = 0) THEN 'New' 
            WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) <= 30) THEN 'Followup'
            ELSE 'Undecided' END AS Category    
        FROM #TEMPTABLE
        WHERE Category = 'Undecided'
    )
    UPDATE #TEMPTABLE
    SET Category = CTE.Category
    FROM #TEMPTABLE t
        LEFT JOIN CTE ON CTE.ApptID = t.ApptID
    WHERE t.Category = 'Undecided'
END

SELECT ApptID, PatientID, ApptDate, Category 
FROM #TEMPTABLE

Answer 6

在所有人中以及在恕我直言中，

There is not much difference between While LOOP and Recursive CTE in terms of RBAR

同时使用Recursive CTE和Window Partition function时，性能没有太大提高。

Appid应该是int identity(1,1)，或者应该不断增加clustered index。

除了其他好处，它还确保该患者的所有连续行APPDate必须更大。

通过这种方式，您可以轻松地在查询中使用APPID，这比将inequality运算符（例如>，<）放入APPDate中更为有效。将inequality运算符（例如>，<）放入APPID中将有助于Sql Optimizer。

表中也应该有两个日期列

APPDateTime datetime2(0) not null,
Appdate date not null

由于这些是最重要的表中最重要的列，因此无需大量转换，转换。

因此Non clustered index可以在Appdate上创建

Create NonClustered index ix_PID_AppDate_App  on APP (patientid,APPDate) include(other column which is not i predicate except APPID)

使用其他示例数据和lemme测试我的脚本，以了解对于哪些示例数据不起作用。即使它不起作用，我也可以在脚本逻辑本身中修复它。

CREATE TABLE #Appt1 (ApptID INT, PatientID INT, ApptDate DATE)
INSERT INTO #Appt1
SELECT  1,101,'2020-01-05'  UNION ALL
SELECT  2,505,'2020-01-06'  UNION ALL
SELECT  3,505,'2020-01-10'  UNION ALL
SELECT  4,505,'2020-01-20'  UNION ALL
SELECT  5,101,'2020-01-25'  UNION ALL
SELECT  6,101,'2020-02-12'  UNION ALL
SELECT  7,101,'2020-02-20'  UNION ALL
SELECT  8,101,'2020-03-30'  UNION ALL
SELECT  9,303,'2020-01-28'  UNION ALL
SELECT  10,303,'2020-02-02' 

;With CTE as
(
select a1.* ,a2.ApptDate as NewApptDate
from #Appt1 a1
outer apply(select top 1 a2.ApptID ,a2.ApptDate
from #Appt1 A2 
where a1.PatientID=a2.PatientID and a1.ApptID>a2.ApptID 
and DATEDIFF(day,a2.ApptDate, a1.ApptDate)>30
order by a2.ApptID desc )A2
)
,CTE1 as
(
select a1.*, a2.ApptDate as FollowApptDate
from CTE A1
outer apply(select top 1 a2.ApptID ,a2.ApptDate
from #Appt1 A2 
where a1.PatientID=a2.PatientID and a1.ApptID>a2.ApptID 
and DATEDIFF(day,a2.ApptDate, a1.ApptDate)<=30
order by a2.ApptID desc )A2
)
select  * 
,case when FollowApptDate is null then 'New' 
when NewApptDate is not null and FollowApptDate is not null 
and DATEDIFF(day,NewApptDate, FollowApptDate)<=30 then 'New'
else 'Followup' end
 as Category
from cte1 a1
order by a1.PatientID

drop table #Appt1

Answer 7

希望这会对您有所帮助。

WITH CTE AS
(
    SELECT #Appt1.*, RowNum = ROW_NUMBER() OVER (PARTITION BY PatientID ORDER BY ApptDate, ApptID) FROM #Appt1
)

SELECT A.ApptID , A.PatientID , A.ApptDate ,
Expected_Category = CASE WHEN (DATEDIFF(MONTH, B.ApptDate, A.ApptDate) > 0) THEN 'New' 
WHEN (DATEDIFF(DAY, B.ApptDate, A.ApptDate) <= 30) then 'Followup' 
ELSE 'New' END
FROM CTE A
LEFT OUTER JOIN CTE B on A.PatientID = B.PatientID 
AND A.rownum = B.rownum + 1
ORDER BY A.PatientID, A.ApptDate

Answer 8

您可以使用analyze API。

120cm

问题是，应该根据初始任命还是之前的任命来分配此类别？也就是说，如果一个病人有3个约会，我们应该将第三个约会与第一个约会进行比较吗？

您的问题陈述了第一个，这就是我的回答方式。如果不是这种情况，则需要使用select *, CASE WHEN DATEDIFF(d,A1.ApptDate,A2.ApptDate)>30 THEN 'New' ELSE 'FollowUp' END 'Category' from (SELECT PatientId, MIN(ApptId) 'ApptId', MIN(ApptDate) 'ApptDate' FROM #Appt1 GROUP BY PatientID) A1, #Appt1 A2 where A1.PatientID=A2.PatientID AND A1.ApptID<A2.ApptID。

此外，请记住，lag在周末也不例外。如果这仅是工作日，则需要创建自己的标量值函数。

Answer 9

使用滞后功能


select  apptID, PatientID , Apptdate ,  
    case when date_diff IS NULL THEN 'NEW' 
         when date_diff < 30 and (date_diff_2 IS NULL or date_diff_2 < 30) THEN  'Follow Up'
         ELSE 'NEW'
    END AS STATUS FROM 
(
select 
apptID, PatientID , Apptdate , 
DATEDIFF (day,lag(Apptdate) over (PARTITION BY PatientID order by ApptID asc),Apptdate) date_diff ,
DATEDIFF(day,lag(Apptdate,2) over (PARTITION BY PatientID order by ApptID asc),Apptdate) date_diff_2
  from #Appt1
) SRC

演示-> https://rextester.com/TNW43808

Answer 10

with cte
as
(
select 
tmp.*, 
IsNull(Lag(ApptDate) Over (partition by PatientID Order by  PatientID,ApptDate),ApptDate) PriorApptDate
 from #Appt1 tmp
)
select 
PatientID, 
ApptDate, 
PriorApptDate, 
DateDiff(d,PriorApptDate,ApptDate) Elapsed,
Case when DateDiff(d,PriorApptDate,ApptDate)>30 
or DateDiff(d,PriorApptDate,ApptDate)=0 then 'New' else 'Followup'   end Category   from cte

我的说法是正确的。作者不正确，请参见

无WHILE循环查询

10 个答案:

附录