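SQL Server produces different results with and without ORDER BY

Time: 2019-07-03 02:24:02

Tags: sql-server

I am trying to add a ROW_NUMBER with an ORDER BY, but it does not work as expected. I then tried a plain ORDER BY (without ROW_NUMBER), and the results with and without ORDER BY differ (even the row counts differ).

Here is the full query (I know it is not the best query):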

WITH cte1
AS
(   
    SELECT t1.OrderNo, t1.BlockID, t1.PcbID AS 'TopPcbID', t3.PcbID AS 'MountedOn', 
        t3.TimeDone AS TimeEnd, t9.TimeDone, t7.McID, t8.DeviceID, t8.Program, t6.CurMcID, t9.DeviceID AS D1, t9.Program AS P1,
        ROW_NUMBER() OVER(PARTITION BY t3.PcbID, t7.McID ORDER BY t9.TimeDone DESC) RN
    FROM PanelBlockTrace t1
        INNER JOIN (SELECT PcbID, MIN(BlockNo) AS 'MINIM' FROM PanelBlockTrace
            GROUP BY PcbID) t2 ON t1.PcbID = t2.PcbID
        INNER JOIN PcbTrace t3 ON t1.PcbID = t3.PcbID OR 
            (CASE WHEN t1.BlockNo = 0 THEN t1.BlockID ELSE t1.PcbID END) = t3.PcbID
        INNER JOIN (SELECT PcbID, MAX(McID) AS 'MAXIM' FROM PcbTrace
            WHERE Program NOT LIKE 'PANEL%' GROUP BY PcbID) t4 ON t3.PcbID = t4.PcbID
        INNER JOIN LineDesc t5 ON t3.McID = t5.McID
        INNER JOIN (SELECT LineID, MAX(McID) AS 'CurMcID' FROM LineDesc WHERE McID NOT LIKE '%9' GROUP BY LineID) t6 ON t5.LineID = t6.LineID
        INNER JOIN LineDesc t7 ON t5.LineID = t7.LineID
        LEFT JOIN PcbTrace t8 ON t3.PcbID = t8.PcbID AND t7.McID = t8.McID
        LEFT JOIN PcbTrace t9 ON t8.DeviceID IS NULL 
            AND t9.TimeDone BETWEEN DATEADD(DAY,-1,CONVERT(date, t3.TimeDone)) AND t3.TimeDone
            AND t9.McID = CurMcID AND t9.McID = t7.McID
    WHERE (t1.BlockID IN (...) OR t1.PcbID IN (...))
        AND t1.BlockNo = t2.MINIM
        AND t1.BlockID != t1.PcbID
        AND t1.PcbID != ''
        AND t3.Program NOT LIKE 'PANEL%'
        AND t3.McID = t4.MAXIM
),
cte11
AS
(
    SELECT * FROM cte1
    WHERE RN <= 3
),
cte12
AS
(
    SELECT t1.OrderNo, t1.BlockID, t1.TopPcbID, t1.MountedOn, t1.TimeEnd, LEAD(t1.TimeDone,2) OVER(ORDER BY t1.MountedOn) AS T1, 
        t1.McID, t1.DeviceID, t1.Program, t1.D1, t1.P1, t2.CurMcID, t1.RN
    FROM cte11 t1
        INNER JOIN
            (SELECT t1.MountedOn,t1.McID,t1.P1,t1.Program,MAX(RN) AS LastRec, t1.CurMcID-1 AS CurMcID 
            FROM cte11 t1
                LEFT JOIN (SELECT MountedOn,McID,Program,P1 FROM cte11 WHERE RN = 1) t2 ON
                    t1.MountedOn = t2.MountedOn AND t1.McID = t2.McID AND (t1.Program = t2.Program OR t1.P1 = t2.P1)
            WHERE RN <= 3
            GROUP BY t1.MountedOn, t1.McID, t1.P1, t1.Program, t1.CurMcID) t2 ON 
                t1.MountedOn = t2.MountedOn AND t1.McID = t2.McID 
                AND (t1.RN = t2.LastRec OR t1.RN = 1)

),
cte2
AS
(
    SELECT t1.*,t2.DeviceID AS D2, t2.Program AS P2, t2.TimeDone AS T2
    FROM cte12 t1
        LEFT JOIN PcbTrace t2 ON t1.DeviceID IS NULL AND t1.D1 IS NULL AND t1.McID = t1.CurMcID AND t1.McID = t2.McID
            AND t2.TimeDone BETWEEN DATEADD(DAY,-1,CONVERT(date, t1.T1)) AND t1.T1 

)
SELECT * FROM cte2

Anyway, this is where things get weird.

The end goal is to include a ROW_NUMBER:

cte2
AS
(
    SELECT t1.*,t2.DeviceID AS D2, t2.Program AS P2, t2.TimeDone AS T2, 
        ROW_NUMBER() OVER (PARTITION BY MountedOn, t1.McID ORDER BY t2.TimeDone) RN2
    FROM cte12 t1
        LEFT JOIN PcbTrace t2 ON t1.DeviceID IS NULL AND t1.D1 IS NULL AND t1.McID = t1.CurMcID AND t1.McID = t2.McID
            AND t2.TimeDone BETWEEN DATEADD(DAY,-1,CONVERT(date, t1.T1)) AND t1.T1 

)
SELECT * FROM cte2

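Because adding ROW_NUMBER produces a completely different result (3762 rows without ROW_NUMBER versus 17 rows with it), I tried simplifying the last part and found that ORDER BY is the cause.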

cte2
AS
(
    SELECT t2.DeviceID AS Device2, t2.Program AS Program2, t2.TimeDone AS Time2
    FROM cte12 t1
        LEFT JOIN PcbTrace t2 ON t1.DeviceID IS NULL AND t1.D1 IS NULL AND t1.McID = t1.CurMcID AND t1.McID = t2.McID
            AND t2.TimeDone BETWEEN DATEADD(DAY,-1,CONVERT(date, t1.T1)) AND t1.T1 

)
SELECT * FROM cte2

Result without ORDER BY (first 10 of 3762 rows):

Device2 Program2    Time2
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
1557852877  G8542G004MPB_4M3_00 2019-05-15 00:01:59.777
1557852877  G8542G004MPB_4M3_00 2019-05-15 00:04:56.790
1557852877  G8542G004MPB_4M3_00 2019-05-15 00:05:42.843

Code with ORDER BY:

cte2
AS
(
    SELECT t2.DeviceID AS Device2, t2.Program AS Program2, t2.TimeDone AS Time2
    FROM cte12 t1
        LEFT JOIN PcbTrace t2 ON t1.DeviceID IS NULL AND t1.D1 IS NULL AND t1.McID = t1.CurMcID AND t1.McID = t2.McID
            AND t2.TimeDone BETWEEN DATEADD(DAY,-1,CONVERT(date, t1.T1)) AND t1.T1 

)
SELECT * FROM cte2
ORDER BY Time2

Result (first 10 of 17 rows):

Device2 Program2    Time2
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL

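Notes:

  1. This CTE is actually the fourth CTE, and it uses the result of another CTE (cte12). I am not sure how that affects the result.
  2. There is no DDL, because I am pulling data from an existing database and mocking this many tables would be too complicated.
  3. Yes, the only difference between the last two queries is the ORDER BY clause, yet the number of rows returned differs (3762 versus 17).

Edit:

  1. If I just use TOP (e.g. TOP 10000), the result is as expected, 3762 rows. But if the TOP number is too large (more than TOP 27415), it reverts to 17 rows.
  2. If I change the time restriction from

    t2.TimeDone BETWEEN DATEADD(DAY,-1,CONVERT(date, t1.T1)) AND t1.T1

    to

    t2.TimeDone BETWEEN DATEADD(DAY,-1,t1.T1) AND t1.T1

    it works as expected (3762 rows).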

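1 answer:

Answer 0 (score: 0)

If you do not specify ORDER BY, SQL Server is free to return the results in whatever order is convenient. In theory, you could run the same query dozens of times and get a different order each time.

But unless you use TOP, it will always return the same rows.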


More, after looking at the whole query:

In your first CTE, we see

ROW_NUMBER() OVER(PARTITION BY ... ORDER BY t9.TimeDone DESC) RN

In the second CTE, we see

   SELECT * FROM cte1
    WHERE RN <= 3

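Now imagine SQL Server producing the results for the first CTE. It is first asked to sort the values by TimeDone DESC and then to assign row numbers to the result. Consider the case where several rows have the same TimeDone value. It must put rows with a higher TimeDone before rows with a lower one, but rows with the same value can be placed in whatever order is convenient. You then keep only the first 3 rows. (This is a sneaky way of doing the same thing as TOP!)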
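One way to check whether such ties actually occur is to look for duplicate timestamps per machine in the base table. This is only a sketch: it assumes that PcbTrace rows sharing the same McID and TimeDone are what end up tied inside a ROW_NUMBER partition, which the question does not confirm (rows duplicated by the joins would tie as well).

```sql
-- Sketch: list (McID, TimeDone) pairs that occur more than once in PcbTrace.
-- Any such pair is a candidate tie for ORDER BY t9.TimeDone DESC.
SELECT McID, TimeDone, COUNT(*) AS TiedRows
FROM PcbTrace
GROUP BY McID, TimeDone
HAVING COUNT(*) > 1
ORDER BY TiedRows DESC, McID, TimeDone;
```

Note also that t9 comes from a LEFT JOIN: rows with no match all carry a NULL TimeDone, so within one partition they tie with each other too.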

Suppose you have a table like this:

ID  Time 
A   05:00
B   04:00
C   03:00
D   03:00
E   02:00
F   01:00

You ask SQL Server to sort by Time DESC and assign a row number to each row. There are two possible outcomes.

A  05:00  1
B  04:00  2
C  03:00  3
D  03:00  4
E  02:00  5
F  01:00  6

A  05:00  1
B  04:00  2
D  03:00  3        // <-- C and D
C  03:00  4        //     are swapped
E  02:00  5
F  01:00  6

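Both satisfy the rule that higher Time values come before lower ones. When values are equal, the order is whatever SQL Server finds convenient.

But if you filter these results on RN <= 3, you get two different results: A, B, C in the first case and A, B, D in the second.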
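The example can be reproduced with a small script. This is a minimal sketch: the table variable @Demo and its column names are made up for illustration, and using ID as a tiebreaker is an assumption that only works because ID is unique here. In the real query you would need some column (or combination of columns) that is unique within each partition.

```sql
-- Hypothetical demo table mirroring the example above (names are illustrative).
DECLARE @Demo TABLE (ID char(1) NOT NULL PRIMARY KEY, TimeDone time(0) NOT NULL);

INSERT INTO @Demo (ID, TimeDone)
VALUES ('A', '05:00'), ('B', '04:00'), ('C', '03:00'),
       ('D', '03:00'), ('E', '02:00'), ('F', '01:00');

-- Not deterministic: C and D tie on TimeDone, so either one may receive RN = 3.
SELECT ID, TimeDone, RN
FROM (
    SELECT ID, TimeDone,
           ROW_NUMBER() OVER (ORDER BY TimeDone DESC) AS RN
    FROM @Demo
) AS ranked
WHERE RN <= 3;

-- Deterministic: ID breaks the tie, so the surviving rows are always A, B and C.
SELECT ID, TimeDone, RN
FROM (
    SELECT ID, TimeDone,
           ROW_NUMBER() OVER (ORDER BY TimeDone DESC, ID ASC) AS RN
    FROM @Demo
) AS ranked
WHERE RN <= 3;
```

On a table this small the first query will probably return the same rows every time you run it. The point is only that nothing guarantees it, whereas the second query leaves SQL Server no choice.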

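If you then take these two results and feed them into ever more complex CTEs, you can end up with completely different answers.

Every time you change the overall query by adding a TOP with a different value or an ORDER BY, SQL Server may generate a completely different execution plan to find the results.

Even if you never change the query or the data, it may produce different results depending on other environmental factors.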
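A related point: SQL Server does not store the result of a CTE. Each reference to it can be expanded and evaluated again, and your cte11 is referenced several times inside cte12. One possible way to make every reference see the same rows is to evaluate the ranking once and store it in a temp table. The sketch below uses made-up names (#Source, #Top3) and a toy table standing in for the output of cte1; it illustrates the idea and is not a drop-in replacement for your query.

```sql
-- Hypothetical stand-in for the rows produced by cte1 (names are illustrative).
CREATE TABLE #Source (ID char(1) NOT NULL PRIMARY KEY, TimeDone time(0) NOT NULL);

INSERT INTO #Source (ID, TimeDone)
VALUES ('A', '05:00'), ('B', '04:00'), ('C', '03:00'),
       ('D', '03:00'), ('E', '02:00'), ('F', '01:00');

-- Evaluate ROW_NUMBER exactly once and store the rows that survive the filter.
SELECT ID, TimeDone, RN
INTO #Top3
FROM (
    SELECT ID, TimeDone,
           ROW_NUMBER() OVER (ORDER BY TimeDone DESC, ID ASC) AS RN
    FROM #Source
) AS ranked
WHERE RN <= 3;

-- Every later reference reads the same stored rows.
SELECT cur.ID, cur.RN, firstRow.ID AS FirstID
FROM #Top3 AS cur
    INNER JOIN #Top3 AS firstRow ON firstRow.RN = 1
ORDER BY cur.RN;

DROP TABLE #Top3;
DROP TABLE #Source;
```

Combined with a tiebreaker in the ORDER BY of the window function, this should make the intermediate result stable regardless of which plan the optimizer picks for the rest of the query.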