Select the top 1 row from each group

Time: 2013-03-13 08:56:01

Tags: sql sql-server-2012 group-by greatest-n-per-group

I have a table that lists the installed software versions:

id  | userid | version | datetime
----+--------+---------+------------------------
111 | 75     | 10075   | 2013-03-12 13:40:58.770
112 | 75     | 10079   | 2013-03-12 13:41:01.583
113 | 78     | 10065   | 2013-03-12 14:18:24.463
114 | 78     | 10079   | 2013-03-12 14:22:20.437
115 | 78     | 10079   | 2013-03-12 14:24:01.830
116 | 78     | 10080   | 2013-03-12 14:24:06.893
117 | 74     | 10080   | 2013-03-12 15:31:42.797
118 | 75     | 10079   | 2013-03-13 07:03:56.157
119 | 75     | 10080   | 2013-03-13 07:05:23.137
120 | 65     | 10080   | 2013-03-13 07:24:33.323
121 | 68     | 10080   | 2013-03-13 08:03:24.247
122 | 71     | 10080   | 2013-03-13 08:20:16.173
123 | 78     | 10080   | 2013-03-13 08:28:25.487
124 | 56     | 10080   | 2013-03-13 08:49:44.503

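I would like to display all fields of one record per userid, but only the record with the highest version (note that version is a varchar).

7 answers:

Answer 0: (score: 7)

If you use SQL Server (2005 or later), you can use a CTE with the ROW_NUMBER function. CAST the version to INT to get the correct order: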

WITH cte 
     AS (SELECT id, 
                userid, 
                version, 
                datetime, 
                Row_number() 
                  OVER ( 
                    partition BY userid 
                    ORDER BY Cast(version AS INT) DESC) rn 
         FROM   [dbo].[table]) 
SELECT id, 
       userid, 
       version, 
       datetime 
FROM   cte 
WHERE  rn = 1 
ORDER BY userid

Demo

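ROW_NUMBER always returns exactly one record per user, even if that user has several rows with the same (top) version. If you want to return all of a user's top-version records, replace ROW_NUMBER with DENSE_RANK.

Answer 1: (score: 7)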
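To make the difference on ties concrete, here is a minimal sketch that runs the same CTE with both functions. It assumes SQLite (through Python's built-in sqlite3 module, with window-function support) rather than SQL Server, and the table name installs is made up for the example. The rows are user 78's rows from the question, where version 10080 appears twice.

```python
import sqlite3

# User 78's rows from the question's sample data (version 10080 occurs twice).
rows = [
    (113, 78, "10065", "2013-03-12 14:18:24.463"),
    (114, 78, "10079", "2013-03-12 14:22:20.437"),
    (115, 78, "10079", "2013-03-12 14:24:01.830"),
    (116, 78, "10080", "2013-03-12 14:24:06.893"),
    (123, 78, "10080", "2013-03-13 08:28:25.487"),
]

conn = sqlite3.connect(":memory:")
# "installs" is a hypothetical table name used only in this sketch.
conn.execute(
    "CREATE TABLE installs (id INTEGER, userid INTEGER, version TEXT, datetime TEXT)"
)
conn.executemany("INSERT INTO installs VALUES (?, ?, ?, ?)", rows)


def top_ids(window_function):
    """Return the ids that get rank 1 per user under the given window function."""
    sql = f"""
        WITH cte AS (
            SELECT id,
                   {window_function}() OVER (
                       PARTITION BY userid
                       ORDER BY CAST(version AS INTEGER) DESC) AS rn
            FROM installs)
        SELECT id FROM cte WHERE rn = 1 ORDER BY id
    """
    return [r[0] for r in conn.execute(sql)]


row_number_ids = top_ids("ROW_NUMBER")  # one row; which tied row wins is arbitrary
dense_rank_ids = top_ids("DENSE_RANK")  # every row tied for the top version
print(row_number_ids)
print(dense_rank_ids)
```

If you need ROW_NUMBER to pick a predictable row among the tied ones, add a tie-breaker such as datetime DESC to the ORDER BY inside the OVER clause.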

You didn't specify how you want ties handled, but if you want the duplicates displayed, this will do it;

SELECT a.* FROM MyTable a
LEFT JOIN MyTable b
  ON a.userid=b.userid
 AND CAST(a.version AS INT) < CAST(b.version AS INT)
WHERE b.version IS NULL

An SQLfiddle to test with

If you want to eliminate duplicates and, when they exist, pick the newest of them, you'll have to extend the query somewhat;

WITH cte AS (SELECT *, CAST(version AS INT) num_version FROM MyTable)
SELECT a.id, a.userid, a.version, a.datetime 
FROM cte a LEFT JOIN cte b
  ON a.userid=b.userid
 AND (a.num_version < b.num_version OR 
     (a.num_version = b.num_version AND a.[datetime]<b.[datetime]))
WHERE b.version IS NULL

Another SQLfiddle
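As a rough check of how the two anti-join variants treat ties, the sketch below runs both against user 78's rows from the question. It assumes SQLite via Python's built-in sqlite3 module instead of SQL Server, and the table name installs is hypothetical. The timestamps are compared as text, which only works here because they are in a sortable ISO-like format.

```python
import sqlite3

# User 78's rows from the question's sample data (version 10080 occurs twice).
rows = [
    (113, 78, "10065", "2013-03-12 14:18:24.463"),
    (114, 78, "10079", "2013-03-12 14:22:20.437"),
    (115, 78, "10079", "2013-03-12 14:24:01.830"),
    (116, 78, "10080", "2013-03-12 14:24:06.893"),
    (123, 78, "10080", "2013-03-13 08:28:25.487"),
]

conn = sqlite3.connect(":memory:")
# "installs" is a hypothetical table name used only in this sketch.
conn.execute(
    "CREATE TABLE installs (id INTEGER, userid INTEGER, version TEXT, datetime TEXT)"
)
conn.executemany("INSERT INTO installs VALUES (?, ?, ?, ?)", rows)

# Variant 1: keep a row when no row of the same user has a higher version.
ties_kept = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM installs a
    LEFT JOIN installs b
      ON a.userid = b.userid
     AND CAST(a.version AS INTEGER) < CAST(b.version AS INTEGER)
    WHERE b.id IS NULL
    ORDER BY a.id
""")]

# Variant 2: additionally treat a newer row with the same version as "better",
# so only the most recent of the tied rows survives.
latest_only = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM installs a
    LEFT JOIN installs b
      ON a.userid = b.userid
     AND (CAST(a.version AS INTEGER) < CAST(b.version AS INTEGER)
          OR (CAST(a.version AS INTEGER) = CAST(b.version AS INTEGER)
              AND a.datetime < b.datetime))
    WHERE b.id IS NULL
    ORDER BY a.id
""")]

print(ties_kept)
print(latest_only)
```

The first variant keeps both rows with version 10080, while the second keeps only the later one (id 123).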

Answer 2: (score: 5)

WITH records
AS
(
    SELECT  id, userid, version, datetime,
            -- version is a varchar, so cast it to compare numerically
            ROW_NUMBER() OVER (PARTITION BY userID
                                ORDER BY CAST(version AS INT) DESC) rn
    FROM    tableName
)
SELECT id, userid, version, datetime
FROM    records
WHERE   RN =1 
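Because version is a varchar, it matters whether the ORDER BY compares it as text or as a number once the values differ in length. A minimal sketch of the difference, assuming SQLite via Python's built-in sqlite3 module rather than SQL Server; the table name installs and the four-digit version 9999 are made up for the example and do not appear in the question's data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "installs" is a hypothetical table name; "9999" is a made-up version value.
conn.execute("CREATE TABLE installs (userid INTEGER, version TEXT)")
conn.executemany(
    "INSERT INTO installs VALUES (?, ?)",
    [(75, "9999"), (75, "10080")],
)

# Text comparison goes character by character, so "9..." sorts above "1...".
text_top = conn.execute(
    "SELECT version FROM installs ORDER BY version DESC LIMIT 1"
).fetchone()[0]

# Casting to an integer compares the actual numeric values.
numeric_top = conn.execute(
    "SELECT version FROM installs ORDER BY CAST(version AS INTEGER) DESC LIMIT 1"
).fetchone()[0]

print(text_top, numeric_top)
```

With the question's sample data every version has five digits, so both orderings happen to agree; the cast only starts to matter when the lengths differ.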

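Answer 3: (score: 0)

select l.* from the_table l
left outer join the_table r
-- version is a varchar, so cast both sides to compare numerically
on l.userid = r.userid and cast(l.version as int) < cast(r.version as int)
where r.version is null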

Answer 4: (score: 0)

I think this should solve your problem:

SELECT id,
       userid,
       Version,
       datetime FROM (
           SELECT id,
                  userid,
                  Version,
                  datetime,
                  -- partition per user (not per id) and rank by numeric version, highest first
                  DENSE_RANK() OVER (PARTITION BY userid ORDER BY CAST(Version AS INT) DESC) AS Rankk
           FROM [dbo].[table]) RS 
WHERE Rankk < 2

I used the DENSE_RANK function as per your requirement.

Answer 5: (score: 0)

The following code will display what you want and is great for performance!

select * from the_table t where cast([version] as int) = 
(select max(cast([version] as int)) from the_table where userid = t.userid)

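Answer 6: (score: 0)

If my experience in tuning has taught me anything, it's that generalities are bad, bad, bad.

But if the table you are getting the Top X from is large (i.e. hundreds of thousands or millions of rows), CROSS APPLY is almost universally the best. In fact, if you benchmark it, CROSS APPLY also performs consistently and admirably at smaller scales (in the thousands), and it always covers the potential requirement of WITH TIES.

Something like: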

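select
    t.id
    ,t.userid
    ,t.version
    ,t.datetime
from
    TheTable t
cross apply
(
    -- best row for this user: highest numeric version, newest first on ties
    select top 1 --with ties
        id
    from
        TheTable
    where
        userid = t.userid
    order by
        cast(version as int) desc, datetime desc
) x -- the applied subquery needs an alias
where
    t.id = x.id -- keep only the row that is the top row for its user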