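SQL cursor alternative

Time: 2018-11-30 13:14:40

Tags: sql sql-server cursor

I need some advice on improving or rewriting the query below. In short, I have a table that I am trying to traverse recursively to generate parent-child relationships.

For example, the table contains: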

PAT_ID 1

+-------+------------------+------------------+
| EP_ID |    START_DTTM    |     END_DTTM     |
+-------+------------------+------------------+
|     1 | 01/12/2018 10:00 | 02/12/2018 15:00 |
|     2 | 03/12/2018 10:00 | 10/12/2018 15:00 |
|     3 | 04/12/2018 10:00 | 06/12/2018 15:00 |
|     4 | 07/12/2018 10:00 | 09/12/2018 15:00 |
|     5 | 11/12/2018 10:00 | 13/12/2018 15:00 |
|     6 | 12/12/2018 10:00 | 12/12/2018 15:00 |
|     7 | 01/12/2019 10:00 | 02/12/2019 15:00 |
+-------+------------------+------------------+

Desired output:

+--------+-------+-----------+-----------------------------------------------------------------------------------------+
| PAT_ID | EP_ID | PARENT_ID |                                        LINK_TYPE                                        |
+--------+-------+-----------+-----------------------------------------------------------------------------------------+
|      1 |     1 |         0 | 'Parent'                                                                                |
|      1 |     2 |         1 | 'Child' (Rule for child is that START_DTTM is less than 24 hours of parent EP_ID)       |
|      1 |     3 |         2 | 'Inner' (Rule for inner is that START_DTTM is between START_DTTM and END_DTTM of Child) |
|      1 |     4 |         2 | 'Inner'                                                                                 |
|      1 |     5 |         0 | 'Parent' (doesn't qualify as child or inner for any row)                                |
|      1 |     6 |         5 | 'Child'                                                                                 |
|      1 |     7 |         0 | 'Parent'                                                                                |
+--------+-------+-----------+-----------------------------------------------------------------------------------------+
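
One way to read the two rules in the table is as join predicates between pairs of episodes for the same patient. The sketch below is only an illustration under assumptions: the table variable @EP and the column names CANDIDATE_PARENT_ID and CANDIDATE_LINK are hypothetical, the sample rows are the ones shown above (dates read as dd/mm/yyyy), and the 24-hour rule is taken from the parent's END_DTTM, as the cursor code further down does with DATEDIFF. It lists candidate pairs only. Choosing a single parent per episode, and deciding whether a 'Child' may itself act as a parent, is not settled by the rules as stated.

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @EP TABLE (
    PAT_ID INT
    ,EP_ID INT
    ,START_DTTM DATETIME
    ,END_DTTM DATETIME
    );

INSERT INTO @EP (PAT_ID, EP_ID, START_DTTM, END_DTTM)
VALUES (1, 1, '2018-12-01T10:00:00', '2018-12-02T15:00:00')
    ,(1, 2, '2018-12-03T10:00:00', '2018-12-10T15:00:00')
    ,(1, 3, '2018-12-04T10:00:00', '2018-12-06T15:00:00')
    ,(1, 4, '2018-12-07T10:00:00', '2018-12-09T15:00:00')
    ,(1, 5, '2018-12-11T10:00:00', '2018-12-13T15:00:00')
    ,(1, 6, '2018-12-12T10:00:00', '2018-12-12T15:00:00')
    ,(1, 7, '2019-12-01T10:00:00', '2019-12-02T15:00:00');

-- Every (episode, candidate parent) pair that satisfies one of the two rules.
SELECT c.PAT_ID
    ,c.EP_ID
    ,p.EP_ID AS CANDIDATE_PARENT_ID
    ,CASE 
        -- Starts while p is still open: 'Inner' rule.
        WHEN c.START_DTTM <= p.END_DTTM THEN 'Inner'
        -- Starts within 24 hours after p ended: 'Child' rule.
        ELSE 'Child'
     END AS CANDIDATE_LINK
FROM @EP p
INNER JOIN @EP c ON c.PAT_ID = p.PAT_ID
    AND c.EP_ID <> p.EP_ID
    AND c.START_DTTM >= p.START_DTTM
WHERE c.START_DTTM <= p.END_DTTM
    OR DATEDIFF(HOUR, p.END_DTTM, c.START_DTTM) BETWEEN 0 AND 24
ORDER BY c.EP_ID
    ,p.EP_ID;
```

Because the pairing is a plain self-join on PAT_ID, it runs as one set-based statement rather than once per row. The sample data has no open-ended episodes, so the ISNULL(p.END_DTTM, getdate()) handling from the cursor code is left out here.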

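I tried writing the logic with a cursor, which seems to return the right rows, but the base table has more than 10 million rows, so it is unlikely to finish before I retire, which unfortunately is still 30 years away :). I need the community's expert advice on how to approach this query (I also tried a while loop, which was slower than the cursor).

Thanks!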

IF (OBJECT_ID('tempdb..#PARENT') IS NOT NULL)
BEGIN
    DROP TABLE #PARENT
END

IF (OBJECT_ID('tempdb..#CHILD') IS NOT NULL)
BEGIN
    DROP TABLE #CHILD
END


CREATE TABLE #PARENT (
    EP_ID INT
    ,ID VARCHAR(20)
    ,PAT_ID VARCHAR(50)
    ,START_DTTM DATETIME
    ,END_DTTM DATETIME
    ,CT_DESC VARCHAR(100)
    ,CT_CODE VARCHAR(10)
    ,PARENT_EP_ID INT
    ,PARENT_ID VARCHAR(20)
    ,LINK VARCHAR(20)
    ,PROCESSED INT
    ,PARENT_EP_SEQ INT
    )

CREATE TABLE #CHILD (
    EP_ID INT
    ,ID VARCHAR(20)
    ,PAT_ID VARCHAR(50)
    ,START_DTTM DATETIME
    ,END_DTTM DATETIME
    ,CT_DESC VARCHAR(100)
    ,CT_CODE VARCHAR(10)
    ,PARENT_EP_ID INT
    ,PARENT_ID VARCHAR(20)
    ,LINK VARCHAR(20)
    ,PROCESSED INT
    ,CHILD_EP_SEQ INT
    )


INSERT INTO #PARENT
SELECT deip.EP_ID
    ,deip.ID
    ,deip.PAT_ID
    ,START_DTTM
    ,END_DTTM
    ,CT_DESC
    ,CT_CODE
    ,0
    ,''
    ,'Parent' AS LINK
    ,0 AS PROCESSED
    ,row_number() OVER (
        PARTITION BY deip.PAT_ID ORDER BY START_DTTM
        ) AS PARENT_EP_SEQ
FROM dbo.deip
INNER JOIN dbo.dEP ep ON deip.EP_ID = ep.EP_ID
WHERE ep.STATUS IN (
        'A'
        ,'D'
        )
    AND ep.RECORD_STATUS = 'A'
    AND
    event_type = 'Active'
    AND CT_CODE <> '10'

PRINT 'Parent Done'

DECLARE @PARENT_EP_SEQ INT
DECLARE @PAT_ID INT
DECLARE @EP_ID INT
DECLARE @COUNT BIGINT

DECLARE ChildCursor CURSOR LOCAL FAST_FORWARD
FOR
SELECT PARENT_EP_SEQ
    ,PAT_ID
    ,EP_ID
FROM #PARENT
where PROCESSED = 0

OPEN ChildCursor

while 1 = 1
BEGIN
    -- And then fetch
    FETCH NEXT
    FROM ChildCursor
    INTO @PARENT_EP_SEQ
        ,@PAT_ID
        ,@EP_ID

    -- And then, if no row is fetched, exit the loop
    IF @@fetch_status <> 0
    BEGIN
        BREAK
    END
    INSERT INTO #CHILD
    SELECT C.EP_ID
        ,C.ID
        ,P.PAT_ID
        ,C.START_DTTM
        ,C.END_DTTM
        ,C.CT_DESC
        ,C.CT_CODE
        ,P.EP_ID AS PARENT_EP_ID
        ,P.ID
        ,'Child' AS LINK
        ,0 AS PROCESSED
        ,row_number() OVER (
            PARTITION BY C.PAT_ID ORDER BY c.START_DTTM
            ) AS CHILD_EP_SEQ
    FROM #PARENT p
    INNER JOIN #PARENT C ON p.PAT_ID = c.PAT_ID
    WHERE P.PAT_ID = @PAT_ID
        AND P.EP_ID = @EP_ID
        AND P.PARENT_EP_SEQ = @PARENT_EP_SEQ
        AND P.EP_ID <> C.EP_ID
        AND P.PARENT_EP_SEQ <> C.PARENT_EP_SEQ
        AND datediff(hh, isnull(p.END_DTTM, getdate()), C.START_DTTM) BETWEEN 0
            AND 24
        AND p.PROCESSED = 0
        AND c.CT_CODE <> '10'
    ORDER BY p.PARENT_EP_SEQ

    DELETE P
    FROM #PARENT P
    INNER JOIN #CHILD c ON p.PAT_ID = c.PAT_ID
        AND p.EP_ID = c.EP_ID

    UPDATE #PARENT
    SET Processed = 1
    WHERE PAT_ID = @PAT_ID
        AND EP_ID = @EP_ID
        AND PARENT_EP_SEQ = @PARENT_EP_SEQ
END

CLOSE ChildCursor

DEALLOCATE ChildCursor

PRINT 'Child Done'

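Afterthought: I considered using a recursive/hierarchical CTE, but I have no key that defines the relationship. The parent-child link is exactly what I am trying to produce.

1 answer:

Answer 0: (score: 0)

You could multi-thread the CURSOR approach, since this sounds like a one-off job rather than something you will run over and over.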

Edit your CURSOR code with a filter so that it runs on about 500,000 rows and start it. Then open another window and add a filter that covers rows 500,001-1,000,000, and so on.
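
As a rough sketch of what such a filter could look like, the cursor declaration from the question might take a range per window. The variables @FROM_PAT_ID and @TO_PAT_ID and their values are hypothetical. The sketch also assumes PAT_ID holds numeric values (the question fetches it into an INT variable), and it slices by patient rather than by row number so that all episodes of one patient stay in the same window.

```sql
-- Hypothetical slice bounds: use a different, non-overlapping pair in each query window.
DECLARE @FROM_PAT_ID INT = 1;
DECLARE @TO_PAT_ID INT = 50000;

-- Same cursor as in the question, restricted to this window's patients.
DECLARE ChildCursor CURSOR LOCAL FAST_FORWARD
FOR
SELECT PARENT_EP_SEQ
    ,PAT_ID
    ,EP_ID
FROM #PARENT
WHERE PROCESSED = 0
    AND PAT_ID BETWEEN @FROM_PAT_ID AND @TO_PAT_ID;
```

Local temp tables such as #PARENT and #CHILD are private to each session, so every window would need to run the load step itself, and the #CHILD rows from all windows would have to be collected somewhere shared (a permanent table, for instance) at the end. Applying the same PAT_ID range to the INSERT INTO #PARENT step would probably also keep each window's temp table small.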

I'd bet it will finish before you come up with a CTE/set-based approach for this logic.