Question

I have this SQL code that computes a row_number based on some values in the table __Working, which is joined to the lookup table __Eval:

;WITH r 
 AS (
    select 

    w.uid,
    t.TypeId,
    --weight
    ROW_NUMBER () OVER (PARTITION BY w.uid ORDER BY  DIFFERENCE(t.val1, w.val1) +  DIFFERENCE(t.val2, w.val2) + DIFFERENCE(t.val3, w.val3) + DIFFERENCE(t.val4, w.val4)  DESC) as Score
    ,w.account

    from __Working w
        join __eval t on w.val1 like t.val1 and IsNull(w.val4, '') like t.val4 and IsNull(w.val2, '') like t.val2 and IsNull(w.val3, '') like t.val3

)

select * from r     where   r.account = 1 and score = 1

This returns TypeId = 1.

But if I write it like this:

   ;WITH r 
     AS (
        select 

        w.uid,
        t.TypeId,
        --weight
        ROW_NUMBER () OVER (PARTITION BY w.uid ORDER BY  DIFFERENCE(t.val1, w.val1) +  DIFFERENCE(t.val2, w.val2) + DIFFERENCE(t.val3, w.val3) + DIFFERENCE(t.val4, w.val4)  DESC) as Score
        ,w.account

        from __Working w
            join __eval t on w.val1 like t.val1 and IsNull(w.val4, '') like t.val4 and IsNull(w.val2, '') like t.val2 and IsNull(w.val3, '') like t.val3
            where   w.account = 1
    )

    select * from r     where   r.account = 1 and score = 1

it returns TypeId = 2. I would expect that if the same UID appeared under several different accounts in __Working, but it does not. What am I missing here?

Answer 1

Oh, this is the weirdness of unstable sorts. Your row_number() expression is:

 ROW_NUMBER() OVER (PARTITION BY w.uid
                    ORDER BY  DIFFERENCE(t.val1, w.val1) +
                              DIFFERENCE(t.val2, w.val2) +
                              DIFFERENCE(t.val3, w.val3) +
                              DIFFERENCE(t.val4, w.val4)  DESC
                   ) as Score

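The problem is that multiple rows have the same value for the ORDER BY key. Different invocations of the query arbitrarily choose which of these tied rows comes first, second, and so on, so adding a WHERE clause (which can change the execution plan) can change which row gets row number 1.

The canonical solution is to include some sort of unique key so that the sort is stable: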
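To check whether ties are really the cause, a diagnostic query along the following lines may help. It is only a sketch: it reuses the join and the score expression from the question, and the `total_score` and `tied_rows` names are introduced here purely for illustration.

```sql
-- Sketch: list every (uid, score) group in which more than one __eval row
-- shares the same ORDER BY value. Any row returned here is a tie that
-- ROW_NUMBER() is free to resolve differently from one run to the next.
SELECT   w.uid,
         s.total_score,
         COUNT(*) AS tied_rows          -- illustrative column name
FROM     __Working w
JOIN     __eval t
           ON  w.val1 LIKE t.val1
           AND ISNULL(w.val2, '') LIKE t.val2
           AND ISNULL(w.val3, '') LIKE t.val3
           AND ISNULL(w.val4, '') LIKE t.val4
-- Compute the score once so it can be grouped on by name.
CROSS APPLY (VALUES (DIFFERENCE(t.val1, w.val1) +
                     DIFFERENCE(t.val2, w.val2) +
                     DIFFERENCE(t.val3, w.val3) +
                     DIFFERENCE(t.val4, w.val4))) AS s(total_score)
WHERE    w.account = 1
GROUP BY w.uid, s.total_score
HAVING   COUNT(*) > 1
ORDER BY w.uid, s.total_score DESC;
```

If this returns rows for the uid in question at its highest score, the two versions of the query are simply picking different rows from the same tie.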

 ROW_NUMBER() OVER (PARTITION BY w.uid
                    ORDER BY  (DIFFERENCE(t.val1, w.val1) +
                               DIFFERENCE(t.val2, w.val2) +
                               DIFFERENCE(t.val3, w.val3) +
                               DIFFERENCE(t.val4, w.val4)
                              ) DESC,
                              ??  -- perhaps typeId
                   ) as Score

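However, I might suggest a more arduous solution. Accept the fact that there might be ties and use rank() or dense_rank() to identify them. Then decide explicitly what to do when a tie occurs: perhaps all of the tied rows are equally interesting to you, or perhaps you have some other method for breaking the ties.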
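A minimal sketch of the rank() approach, assuming the same tables and join as in the question (the `total_score` and `tied_for_first` names are made up for this example):

```sql
-- Sketch: RANK() gives every tied row the same rank, so all candidates
-- for first place come back instead of one arbitrary winner.
;WITH r AS (
    SELECT w.uid,
           t.TypeId,
           w.account,
           RANK() OVER (PARTITION BY w.uid
                        ORDER BY s.total_score DESC) AS Score
    FROM   __Working w
    JOIN   __eval t
             ON  w.val1 LIKE t.val1
             AND ISNULL(w.val2, '') LIKE t.val2
             AND ISNULL(w.val3, '') LIKE t.val3
             AND ISNULL(w.val4, '') LIKE t.val4
    CROSS APPLY (VALUES (DIFFERENCE(t.val1, w.val1) +
                         DIFFERENCE(t.val2, w.val2) +
                         DIFFERENCE(t.val3, w.val3) +
                         DIFFERENCE(t.val4, w.val4))) AS s(total_score)
)
SELECT r.*,
       -- Evaluated after the WHERE clause, so this counts only the rows
       -- that share rank 1 within each uid.
       COUNT(*) OVER (PARTITION BY r.uid) AS tied_for_first
FROM   r
WHERE  r.account = 1
  AND  r.Score = 1;
```

Any uid with `tied_for_first` greater than 1 has an ambiguous best match, and those are the rows that need an explicit tie-breaking rule.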
