SQL: Select the maximum value for each unique key?

Asked: 2009-08-08 00:32:24

Tags: sql sql-server database

Sorry, I'm not sure how to phrase this, and I don't know SQL very well. The database engine I'm using is SQL Server Compact. I currently have this query:

SELECT *
FROM Samples
WHERE FunctionId NOT IN
(SELECT CalleeId FROM Callers)
ORDER BY ThreadId, HitCount DESC

This gives me:

ThreadId   Function  HitCount
       1        164      6945
       1       3817         1
       4       1328      7053

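Now I only want the rows that have the maximum HitCount for each unique value of ThreadId. In other words, the second row should be removed. I'm not sure how to do this.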

[Edit] If it helps, here is an alternative form of the same query:

SELECT *
FROM Samples s1
LEFT OUTER JOIN Callers c1
    ON s1.ThreadId = c1.ThreadId AND s1.FunctionId = c1.CalleeId
WHERE c1.ThreadId IS NULL
ORDER BY s1.ThreadId

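[Edit] I ended up making a schema change to avoid having to do this, because the suggested queries looked rather expensive. Thanks for your help.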

3 answers:

Answer 0: (score: 2)

Here is how I would do it:

SELECT s1.*
FROM Samples s1
LEFT JOIN Samples s2
  ON (s1.ThreadId = s2.ThreadId AND s1.HitCount < s2.HitCount)
WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers)
  AND s2.ThreadId IS NULL
ORDER BY s1.ThreadId, s1.HitCount DESC

In other words, return each row s1 for which no other row s2 exists with the same ThreadId and a higher HitCount.
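
The self-join idea above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server Compact, so engine behavior may differ. The table layout is inferred from the question, the callee row (4, 99, 9000) is hypothetical test data, and the extra NOT IN filter on s2 is an assumption about intent: without it, a callee with a higher HitCount would hide the top non-callee row of its thread.

```python
import sqlite3

# SQLite is only a stand-in here; the question targets SQL Server Compact.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
    CREATE TABLE Callers (ThreadId INTEGER, CalleeId INTEGER);
    -- Three rows from the question plus one hypothetical callee row (4, 99, 9000).
    INSERT INTO Samples VALUES (1, 164, 6945), (1, 3817, 1), (4, 1328, 7053), (4, 99, 9000);
    INSERT INTO Callers VALUES (4, 99);
""")

QUERY = """
    SELECT s1.ThreadId, s1.FunctionId, s1.HitCount
    FROM Samples s1
    LEFT JOIN Samples s2
      ON s1.ThreadId = s2.ThreadId
     AND s1.HitCount < s2.HitCount
     -- Assumption: only non-callee rows may "beat" s1.
     AND s2.FunctionId NOT IN (SELECT CalleeId FROM Callers)
    WHERE s1.FunctionId NOT IN (SELECT CalleeId FROM Callers)
      AND s2.ThreadId IS NULL          -- no better row was found for s1
    ORDER BY s1.ThreadId, s1.HitCount DESC
"""

def top_rows_per_thread(connection):
    """Return the non-callee rows with the highest HitCount in each thread."""
    return connection.execute(QUERY).fetchall()

rows = top_rows_per_thread(conn)
for row in rows:
    print(row)
```

On this data the row (1, 3817, 1) is dropped because (1, 164, 6945) beats it, and the callee row is removed by the WHERE clause.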

Answer 1: (score: 2)

Does SQL Server Compact support window functions?

Alternative 1 - includes all rows that tie. Returns no row for a thread if all of its rows have a null HitCount:

SELECT ThreadId, FunctionId, HitCount
FROM (SELECT ThreadId, FunctionId, HitCount,
        MAX(HitCount) OVER (PARTITION BY ThreadId) AS MaxHitCount
    FROM Samples
    WHERE FunctionId NOT IN
        (SELECT CalleeId FROM Callers)) t
WHERE HitCount = MaxHitCount
ORDER BY ThreadId, HitCount DESC

Alternative 2 - includes all rows that tie. If a given thread has no row with a non-null HitCount, all rows for that thread are returned:

SELECT ThreadId, FunctionId, HitCount
FROM (SELECT ThreadId, FunctionId, HitCount,
        RANK() OVER (PARTITION BY ThreadId ORDER BY HitCount DESC) AS R
    FROM Samples
    WHERE FunctionId NOT IN
        (SELECT CalleeId FROM Callers)) t
WHERE R = 1
ORDER BY ThreadId, HitCount DESC

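Alternative 3 - if there is a tie, non-deterministically picks one row and discards the others. If all rows for a given thread have a null HitCount, one row is still included:
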
SELECT ThreadId, FunctionId, HitCount
FROM (SELECT ThreadId, FunctionId, HitCount,
        ROW_NUMBER() OVER (PARTITION BY ThreadId ORDER BY HitCount DESC) AS R
    FROM Samples
    WHERE FunctionId NOT IN
        (SELECT CalleeId FROM Callers)) t
WHERE R = 1
ORDER BY ThreadId, HitCount DESC
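
To see how RANK() and ROW_NUMBER() differ on ties, here is a small sketch using Python's built-in sqlite3 module as a stand-in engine. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions). The sample rows are made up, and the Callers filter is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
    -- Hypothetical data: thread 1 has two rows tied at HitCount 5.
    INSERT INTO Samples VALUES (1, 10, 5), (1, 11, 5), (1, 12, 2), (2, 20, 7);
""")

def top_rows(connection, ranking_function):
    """Return the rows ranked first per thread by the given window function."""
    # The name is interpolated into the SQL text, so accept only known values.
    if ranking_function not in ("RANK", "ROW_NUMBER"):
        raise ValueError("unsupported ranking function: " + ranking_function)
    sql = f"""
        SELECT ThreadId, FunctionId, HitCount
        FROM (SELECT ThreadId, FunctionId, HitCount,
                     {ranking_function}() OVER (PARTITION BY ThreadId
                                                ORDER BY HitCount DESC) AS R
              FROM Samples) t
        WHERE R = 1
        ORDER BY ThreadId, FunctionId
    """
    return connection.execute(sql).fetchall()

rank_rows = top_rows(conn, "RANK")              # keeps both tied rows of thread 1
row_number_rows = top_rows(conn, "ROW_NUMBER")  # keeps exactly one row per thread

print(rank_rows)
print(row_number_rows)
```

Which of the two tied rows ROW_NUMBER() keeps is up to the engine unless the ORDER BY inside the window is extended with a tiebreaker such as FunctionId.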

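Alternatives 4 & 5 - use older constructs in case window functions are not available, and state the intent a little more clearly than a join does. Benchmark if speed is a priority. Both return all rows that participate in a tie. Alternative 4 returns rows with a null HitCount if no non-null HitCount is available. Alternative 5 does not return rows where HitCount is null.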

SELECT *
FROM Samples s1
WHERE FunctionId NOT IN
    (SELECT CalleeId FROM Callers)
AND NOT EXISTS
    (SELECT *
    FROM Samples s2
    -- Correlate on ThreadId: the maximum is wanted per thread
    WHERE s1.ThreadId = s2.ThreadId
    AND s1.HitCount < s2.HitCount)
ORDER BY ThreadId, HitCount DESC

SELECT *
FROM Samples s1
WHERE FunctionId NOT IN
    (SELECT CalleeId FROM Callers)
AND HitCount =
    (SELECT MAX(HitCount)
    FROM Samples s2
    -- Correlate on ThreadId: the maximum is wanted per thread
    WHERE s1.ThreadId = s2.ThreadId)
ORDER BY ThreadId, HitCount DESC
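
The difference in null handling between the NOT EXISTS form and the MAX form can be checked with a short sketch. It uses Python's built-in sqlite3 module as a stand-in engine, made-up rows, and subqueries correlated on ThreadId because the question asks for the per-thread maximum. The Callers filter is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Samples (ThreadId INTEGER, FunctionId INTEGER, HitCount INTEGER);
    -- Hypothetical data: thread 2 has only a null HitCount.
    INSERT INTO Samples VALUES (1, 10, 5), (1, 11, 2), (2, 20, NULL);
""")

# Alternative 4 style: keep s1 when no row in its thread has a higher HitCount.
NOT_EXISTS_QUERY = """
    SELECT ThreadId, FunctionId, HitCount
    FROM Samples s1
    WHERE NOT EXISTS
        (SELECT 1
         FROM Samples s2
         WHERE s1.ThreadId = s2.ThreadId
           AND s1.HitCount < s2.HitCount)
    ORDER BY ThreadId, HitCount DESC
"""

# Alternative 5 style: keep s1 when its HitCount equals the thread maximum.
MAX_QUERY = """
    SELECT ThreadId, FunctionId, HitCount
    FROM Samples s1
    WHERE HitCount =
        (SELECT MAX(HitCount)
         FROM Samples s2
         WHERE s1.ThreadId = s2.ThreadId)
    ORDER BY ThreadId, HitCount DESC
"""

not_exists_rows = conn.execute(NOT_EXISTS_QUERY).fetchall()
max_rows = conn.execute(MAX_QUERY).fetchall()

print(not_exists_rows)  # the null row of thread 2 is kept
print(max_rows)         # the null row is dropped: NULL = NULL is not true
```

The null row survives NOT EXISTS because a comparison with NULL never evaluates to true, so the subquery finds no "better" row. The equality test against MAX fails for the same reason.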

Answer 2: (score: 1)

This will work on SQL Server 2005+:

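-- Caveat: the inner join on CALLERS matches every caller row in the thread
-- whose calleeid differs, so it can repeat sample rows and skips threads
-- that have no CALLERS rows.
WITH maxHits AS (
  SELECT s.threadid,
         MAX(s.hitcount) AS maxhits
    FROM SAMPLES s
    JOIN CALLERS c ON c.threadid = s.threadid AND c.calleeid <> s.functionid
GROUP BY s.threadid
)
SELECT t.*
  FROM SAMPLES t
  JOIN CALLERS c ON c.threadid = t.threadid AND c.calleeid <> t.functionid
  JOIN maxHits mh ON mh.threadid = t.threadid AND mh.maxhits = t.hitcount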

This works on any database:

SELECT t.*
  FROM SAMPLES t
  JOIN CALLERS c ON c.threadid = t.threadid AND c.calleeid <> t.functionid
  JOIN (SELECT s.threadid,
               MAX(s.hitcount) AS maxhits
          FROM SAMPLES s
          JOIN CALLERS c ON c.threadid = s.threadid AND c.calleeid <> s.functionid
      GROUP BY s.threadid) mh ON mh.threadid = t.threadid AND mh.maxhits = t.hitcount