row_number, partition value changes

Time: 2016-09-21 15:11:53

Tags: sql sql-server

I have this SQL code that finds a row_number based on some values in the table __Working, which is joined to the lookup table __Eval:

;WITH r 
 AS (
    select 

    w.uid,
    t.TypeId,
    --weight
    ROW_NUMBER () OVER (PARTITION BY w.uid ORDER BY  DIFFERENCE(t.val1, w.val1) +  DIFFERENCE(t.val2, w.val2) + DIFFERENCE(t.val3, w.val3) + DIFFERENCE(t.val4, w.val4)  DESC) as Score
    ,w.account

    from __Working w
        join __eval t on w.val1 like t.val1 and IsNull(w.val4, '') like t.val4 and IsNull(w.val2, '') like t.val2 and IsNull(w.val3, '') like t.val3

)

select * from r     where   r.account = 1 and score = 1

This returns TypeId = 1.

But if I write it like this, with the account filter also applied inside the CTE:

   ;WITH r 
     AS (
        select 

        w.uid,
        t.TypeId,
        --weight
        ROW_NUMBER () OVER (PARTITION BY w.uid ORDER BY  DIFFERENCE(t.val1, w.val1) +  DIFFERENCE(t.val2, w.val2) + DIFFERENCE(t.val3, w.val3) + DIFFERENCE(t.val4, w.val4)  DESC) as Score
        ,w.account

        from __Working w
            join __eval t on w.val1 like t.val1 and IsNull(w.val4, '') like t.val4 and IsNull(w.val2, '') like t.val2 and IsNull(w.val3, '') like t.val3
            where   w.account = 1
    )

    select * from r     where   r.account = 1 and score = 1

it returns TypeId = 2. I would expect that if I had multiple UIDs in different accounts in __Working, but I don't. What am I missing here?

1 Answer:

Answer 0: (score: 2)

Ah, this is a quirk of unstable sorts. Your row_number() expression is:

 ROW_NUMBER() OVER (PARTITION BY w.uid
                    ORDER BY  DIFFERENCE(t.val1, w.val1) +
                              DIFFERENCE(t.val2, w.val2) +
                              DIFFERENCE(t.val3, w.val3) +
                              DIFFERENCE(t.val4, w.val4)  DESC
                   ) as Score

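The problem is that multiple rows have the same value for the ORDER BY key. Within a group of tied rows, each execution chooses arbitrarily which row comes first, second, and so on, so a change to the query (such as adding a WHERE clause) can change which row gets row number 1.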
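To check whether ties are really the cause, you can count how many rows share the same sort key within each partition. The following is a minimal sketch rather than the asker's actual schema: it assumes a hypothetical table `matches` whose `score` column stands in for the summed DIFFERENCE() values, and it runs on SQLite through Python's `sqlite3` module instead of SQL Server. The GROUP BY / HAVING query itself is plain SQL.

```python
import sqlite3

# Hypothetical stand-in for the joined __Working/__Eval rows:
# `score` plays the role of the summed DIFFERENCE() values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matches (uid INTEGER, type_id INTEGER, score INTEGER);
    INSERT INTO matches (uid, type_id, score) VALUES
        (1, 1, 12), (1, 2, 12), (1, 3, 9),
        (2, 1, 7),  (2, 2, 5);
""")

# Any (uid, score) pair that occurs more than once is a tie in the
# ORDER BY key of the window function.
tie_query = """
    SELECT uid, score, COUNT(*) AS n
    FROM matches
    GROUP BY uid, score
    HAVING COUNT(*) > 1
    ORDER BY uid, score
"""
ties = conn.execute(tie_query).fetchall()
print(ties)
```

If this returns rows for the uid in question, the order among the tied rows is not defined and ROW_NUMBER() may number them differently from one query to the next.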

The canonical solution is to add some sort of unique key to the ORDER BY so that the sort is stable:

 ROW_NUMBER() OVER (PARTITION BY w.uid
                    ORDER BY  (DIFFERENCE(t.val1, w.val1) +
                               DIFFERENCE(t.val2, w.val2) +
                               DIFFERENCE(t.val3, w.val3) +
                               DIFFERENCE(t.val4, w.val4)
                              ) DESC,
                              ??  -- perhaps typeId
                   ) as Score
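The effect of the extra key can be sketched on a small example. This again assumes a hypothetical `matches` table with a precomputed `score` column, and uses `type_id` as the tie-breaker purely for illustration (it is only a valid choice if it is unique within each uid). It runs on SQLite 3.25 or later via Python's `sqlite3`, not on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matches (uid INTEGER, type_id INTEGER, score INTEGER);
    -- type_id 2 is inserted before type_id 1 on purpose; both score 12.
    INSERT INTO matches (uid, type_id, score) VALUES
        (1, 2, 12), (1, 1, 12), (1, 3, 9),
        (2, 1, 7),  (2, 2, 5);
""")

# The second ORDER BY key (type_id) decides between rows with equal
# score, so row number 1 no longer depends on an arbitrary choice.
best_query = """
    SELECT uid, type_id
    FROM (
        SELECT uid, type_id,
               ROW_NUMBER() OVER (
                   PARTITION BY uid
                   ORDER BY score DESC, type_id
               ) AS rn
        FROM matches
    )
    WHERE rn = 1
    ORDER BY uid
"""
best = conn.execute(best_query).fetchall()
print(best)
```

With the tie-breaker in place, uid 1 always resolves to the lowest `type_id` among its top-scoring rows, regardless of insertion order or extra filters.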

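However, I might suggest a more laborious approach. Accept the fact that there can be ties, and use rank() or dense_rank() to identify them. Then decide explicitly what to do in the case of a tie: perhaps all the tied rows are equally interesting to you, or perhaps you have some other method of breaking ties.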
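As a rough illustration of that approach, RANK() gives every tied row the same rank, so filtering on rank 1 returns all top candidates instead of an arbitrary one. The sketch below makes the same assumptions as before: a hypothetical `matches` table with a precomputed `score`, executed on SQLite 3.25 or later through Python's `sqlite3` rather than on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matches (uid INTEGER, type_id INTEGER, score INTEGER);
    INSERT INTO matches (uid, type_id, score) VALUES
        (1, 1, 12), (1, 2, 12), (1, 3, 9),
        (2, 1, 7),  (2, 2, 5);
""")

# RANK() assigns the same rank to rows with equal score, so every
# top-scoring row per uid survives the rnk = 1 filter.
rank_query = """
    SELECT uid, type_id
    FROM (
        SELECT uid, type_id,
               RANK() OVER (PARTITION BY uid ORDER BY score DESC) AS rnk
        FROM matches
    )
    WHERE rnk = 1
    ORDER BY uid, type_id
"""
top = conn.execute(rank_query).fetchall()

# Collect the candidates per uid; more than one candidate means a tie
# that still needs an explicit decision.
candidates = {}
for uid, type_id in top:
    candidates.setdefault(uid, []).append(type_id)
ambiguous = {uid: ids for uid, ids in candidates.items() if len(ids) > 1}
print(top)
print(ambiguous)
```

Here uid 1 comes back with two candidate types, which makes the tie visible so it can be resolved by a deliberate rule.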