SQL Server bug? Query result non-deterministic when grouping by expressions?

Asked: 2013-11-22 23:02:12

Tags: sql-server sql-server-2008

I have the following query:

with cte1 as (
    select isnull(A, 'Unknown') as A,
           isnull(nullif(B, 'NULL'), 'Unknown') as B,
           C
    from   ... -- uses collate SQL_Latin1_General_CP1_CI_AS when joining 
    group by isnull(A, 'Unknown'), isnull(nullif(B, 'NULL'), 'Unknown'), C
    ),
    cte2 as (select top (2147483647) A, B, C from cte1 order by A, B, C),
      -- Removing cte2 makes it work when run directly as a SQL query. However,
      -- it still behaves the same if the code is in a view or table function.
    ctes as (
    .... -- pretty complex query joining cte2 multiple times
         -- uses row_number(), ntile
    )
    select count(*) from finalCTE

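The result (the count) changes on every execution, and it is much lower than it should be. I found that any one of the following steps makes it work.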

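  1. Materialize the CTE cte1 (into a temporary or permanent table) and use the materialized table instead.
  2. Change the group by in cte1 to any of the following forms:
    • group by A, isnull(nullif(B, 'NULL'), 'Unknown'), C
    • group by isnull(A, 'Unknown'), nullif(B, 'NULL'), C
    • group by A, nullif(B, 'NULL'), C
  3. Use cte1 instead of cte2 in the other CTEs. (Update: this step does not always work. The problem persists when the query is inside a table function, but it works when the SQL is run directly.)

However, why does the original query behave so strangely? Is this a bug in SQL Server?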
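
Workaround 1 could be sketched roughly as follows. This is only an illustration: the table variable @src and its sample rows are hypothetical stand-ins for the elided source of cte1, and #cte1 is an assumed temp table name. The grouped rows are written out once, so every later step reads the same stored set instead of re-evaluating the CTE.

```sql
-- Hypothetical stand-in for the elided source of cte1.
declare @src table (A varchar(20) null, B varchar(20) null, C int not null);
insert into @src (A, B, C)
values (null, 'NULL', 1), ('x', 'y', 2), (null, null, 1);

-- Materialize the grouped rows once (assumed temp table name #cte1).
select isnull(A, 'Unknown') as A,
       isnull(nullif(B, 'NULL'), 'Unknown') as B,
       C
into   #cte1
from   @src
group by isnull(A, 'Unknown'), isnull(nullif(B, 'NULL'), 'Unknown'), C;

-- Later steps read from #cte1 instead of the CTE.
select count(*) as row_count from #cte1;

drop table #cte1;
```

A function body cannot create a temp table, so when the query lives in a table function the materialization would have to happen in the calling batch or procedure.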

Full function code:

    ALTER function [dbo].[fn] (@para1 char(3))
    returns table
    return
    with    cte1 as ( select AAA, BBB, CCC
                   from     dbo.fnBBB(12)
                   where    @para1 = 'xxxx'
                   union all
                   select   AAA, BBB, CCC
                   from     dbo.fnBBB2(12)
                   where    @para1 = 'yyyy'
                 ),
            -- Tested without cte2: same behavior
            cte2 as (select top (2147483647) AAA, BBB, CCC from cte1 order by AAA, BBB, CCC),
            t as ( select   e.CCC, e.value1, cte2.BBB, cte2.AAA
                   from     dbo.T1 e
                            join cte2 on e.CCC = cte2.CCC
                 ),
            b as ( select   BBB, AAA, count(*) count, 
                            case when count(*) / 5 > 10 then 10 
                                 else count(*) / 5 
                            end as buckets
                   from     t 
                   group by BBB, AAA 
                   having   count(*) >= 5 
                 ),
            b2
              as ( select   t.*
                   from     b
                            cross apply ( select    *,
                                                    ntile(b.buckets) over ( partition by t.BBB, t.AAA order by value1, CCC )
                                                    as bucket
                                          from      t
                                          where     BBB = b.BBB
                                                    and AAA = b.AAA
                                        ) t
                 ),
            m1
              as ( select   AAA, BBB, b2.CCC, Date, SId, value2, b2.bucket, --
                            _asc = row_number() over ( partition by BBB, AAA, bucket, Date, SId order by value2, b2.CCC ),
                            _desc = row_number() over ( partition by BBB, AAA, bucket, Date, SId order by value2 desc, b2.CCC desc )
                            ,count(*) over (partition by BBB, AAA, bucket, Date, SId) scount
                   from     b2 join dbo.T2 e on b2.CCC = e.CCC
                 ),
            median
              as ( select   BBB, AAA, bucket, Date, SId, avg(value2) value2Median, min(scount) sCount
                   from     m1
                   where    _asc in ( _desc, _desc - 1, _desc + 1 )
                   group by BBB, AAA, bucket, Date, SId
                 ),
            bounds
              as ( select   BBB, AAA, bucket, min(value1) dboMin, max(value1) value1Max, count(*) count 
                   from     b2
                   group by BBB, AAA, bucket 
                 )
        select  m.*, b.dboMin, b.value1Max, Count
        from    median m join bounds b on m.BBB = b.BBB and m.AAA = b.AAA and m.bucket = b.bucket 
        -- order by BBB, AAA, bucket 
    

Function used in cte1:

    CREATE function [dbo].[fnBBB](@param int) 
    returns table
    return
    with    m as ( select   * -- only this view has a non-default collation (..._CS_AS)
                   from     dbo.view1 -- indexed view
                 )
        select  isnull(g.AAA, 'Unknown') as AAA,
                isnull(nullif(m1.value, 'NULL'), 'Unknown') as BBB
                , m.CCC
        from    m 
                left join dbo.mapping m0 on m0.id = 12
                    and m0.value = m.v1 collate SQL_Latin1_General_CP1_CI_AS
                left join dbo.map1 r on r.Country = m0.value
                left join dbo.map2 g on g.N = r.N
                left join dbo.mapping m1 on m1.id = 20
                    and m1.value = m.v2 collate SQL_Latin1_General_CP1_CI_AS
        where   m.run_date > dateadd(mm, -@param, getdate())
        group by isnull(g.AAA, 'Unknown'), isnull(nullif(m1.value, 'NULL'), 'Unknown'), m.CCC
    

1 Answer:

Answer 0 (score: 2)

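SQL is a set-based language. In this paradigm, the order in which rows are returned is generally irrelevant; you can treat unordered as the default behavior. When you actually want the rows sorted, you need an explicit ORDER BY somewhere in the query to specify how they should be ordered.

For a normal unordered query, the actual order of the returned rows can be determined by many things: for example, the physical layout of the rows on disk, the order of the index nodes in whichever index the query optimizer uses to return the rows, the actual execution order of the query plan steps, and so on. Most of these are decided at execution time and may even vary between subsequent executions.

If this is what you are observing, then it is not a bug, but basic and normal behavior in all relational database engines.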
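
The point about ORDER BY can be illustrated with a small sketch. The table variable @t and its rows are hypothetical; only the column names and the cte2 pattern are borrowed from the question. The sketch contrasts a query with no ordering, one ordered at the outermost level, and one where the ORDER BY sits inside a CTE next to TOP.

```sql
-- Hypothetical sample data reusing the column names from the question.
declare @t table (AAA varchar(20) not null, BBB varchar(20) not null, CCC int not null);
insert into @t (AAA, BBB, CCC)
values ('b', 'x', 2), ('a', 'y', 1), ('a', 'x', 3);

-- No ORDER BY: any row order is a valid result and may differ between runs.
select AAA, BBB, CCC from @t;

-- ORDER BY on the outermost query: the result order is defined.
select AAA, BBB, CCC from @t order by AAA, BBB, CCC;

-- ORDER BY inside a CTE only tells TOP which rows to keep;
-- it does not guarantee the order in which the outer query returns them.
with cte2 as (
    select top (2147483647) AAA, BBB, CCC from @t order by AAA, BBB, CCC
)
select AAA, BBB, CCC from cte2;
```

Only the second query is guaranteed to return rows in AAA, BBB, CCC order; the other two may happen to look ordered on a small table without that being guaranteed.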