如何通过SQL中的另一列选择MAX(列值),DISTINCT的行?

时间:2009-03-04 20:14:27

标签: mysql sql max distinct greatest-n-per-group

我的表是:

id  home  datetime     player   resource
---|-----|------------|--------|---------
1  | 10  | 04/03/2009 | john   | 399 
2  | 11  | 04/03/2009 | juliet | 244
5  | 12  | 04/03/2009 | borat  | 555
3  | 10  | 03/03/2009 | john   | 300
4  | 11  | 03/03/2009 | juliet | 200
6  | 12  | 03/03/2009 | borat  | 500
7  | 13  | 24/12/2008 | borat  | 600
8  | 13  | 01/01/2009 | borat  | 700

我需要选择保持home最大值的每个不同datetime

结果将是:

id  home  datetime     player   resource 
---|-----|------------|--------|---------
1  | 10  | 04/03/2009 | john   | 399
2  | 11  | 04/03/2009 | juliet | 244
5  | 12  | 04/03/2009 | borat  | 555
8  | 13  | 01/01/2009 | borat  | 700

我尝试过:

-- 1 ..by the MySQL manual: 

SELECT DISTINCT
  home,
  id,
  datetime AS dt,
  player,
  resource
FROM topten t1
WHERE datetime = (SELECT
  MAX(t2.datetime)
FROM topten t2
GROUP BY home)
GROUP BY datetime
ORDER BY datetime DESC

不起作用。尽管数据库保持187,但结果集有130行。 结果包含一些重复的home

-- 2 ..join

SELECT
  s1.id,
  s1.home,
  s1.datetime,
  s1.player,
  s1.resource
FROM topten s1
JOIN (SELECT
  id,
  MAX(datetime) AS dt
FROM topten
GROUP BY id) AS s2
  ON s1.id = s2.id
ORDER BY datetime 

不。提供所有记录。

-- 3 ..something exotic: 

有各种结果。

21 个答案:

答案 0 :(得分:860)

你真是太近了!您需要做的就是选择主页及其最长日期时间,然后返回到两个字段上的topten表:

SELECT tt.*
FROM topten tt
INNER JOIN
    (SELECT home, MAX(datetime) AS MaxDateTime
    FROM topten
    GROUP BY home) groupedtt 
ON tt.home = groupedtt.home 
AND tt.datetime = groupedtt.MaxDateTime

答案 1 :(得分:70)

这里有 T-SQL 版本:

-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME, 
  player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700

-- Answer
SELECT id, home, date, player, resource 
FROM (SELECT id, home, date, player, resource, 
    RANK() OVER (PARTITION BY home ORDER BY date DESC) N
    FROM @TestTable
)M WHERE N = 1

-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource 
    FROM @TestTable T
INNER JOIN 
(   SELECT TI.id, TI.home, TI.date, 
        RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
    FROM @TestTable TI
    WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id

修改
不幸的是,MySQL中没有RANK()OVER函数 但可以模仿,见Emulating Analytic (AKA Ranking) Functions with MySQL 所以这是 MySQL 版本:

SELECT id, home, date, player, resource 
FROM TestTable AS t1 
WHERE 
    (SELECT COUNT(*) 
            FROM TestTable AS t2 
            WHERE t2.home = t1.home AND t2.date > t1.date
    ) = 0

答案 2 :(得分:64)

最快的MySQL解决方案,没有内部查询且没有GROUP BY

SELECT m.*                    -- get the row that contains the max value
FROM topten m                 -- "m" from "max"
    LEFT JOIN topten b        -- "b" from "bigger"
        ON m.home = b.home    -- match "max" row with "bigger" row by `home`
        AND m.datetime < b.datetime           -- want "bigger" than "max"
WHERE b.datetime IS NULL      -- keep only if there is no bigger than max

<强>解释

使用home列自行加入表格。使用LEFT JOIN可确保表m中的所有行都显示在结果集中。表b中没有匹配项的NULL列{}为b

JOIN上的其他条件要求仅匹配b列中datetime列上的值大于m中的行的LEFT JOIN行。

使用问题中发布的数据,+------------------------------------------+--------------------------------+ | the row from `m` | the matching row from `b` | |------------------------------------------|--------------------------------| | id home datetime player resource | id home datetime ... | |----|-----|------------|--------|---------|------|------|------------|-----| | 1 | 10 | 04/03/2009 | john | 399 | NULL | NULL | NULL | ... | * | 2 | 11 | 04/03/2009 | juliet | 244 | NULL | NULL | NULL | ... | * | 5 | 12 | 04/03/2009 | borat | 555 | NULL | NULL | NULL | ... | * | 3 | 10 | 03/03/2009 | john | 300 | 1 | 10 | 04/03/2009 | ... | | 4 | 11 | 03/03/2009 | juliet | 200 | 2 | 11 | 04/03/2009 | ... | | 6 | 12 | 03/03/2009 | borat | 500 | 5 | 12 | 04/03/2009 | ... | | 7 | 13 | 24/12/2008 | borat | 600 | 8 | 13 | 01/01/2009 | ... | | 8 | 13 | 01/01/2009 | borat | 700 | NULL | NULL | NULL | ... | * +------------------------------------------+--------------------------------+ 将生成此对:

WHERE

最后,NULL子句只保留b列中*的对(在上表中标有JOIN);这意味着,由于m子句中的第二个条件,从datetime中选择的行在{{1}}列中具有最大值。

阅读SQL Antipatterns: Avoiding the Pitfalls of Database Programming书籍了解其他SQL技巧。

答案 3 :(得分:26)

即使每个home有两行或多行,并且DATETIME位相同,这也会有效:

SELECT id, home, datetime, player, resource
FROM   (
       SELECT (
              SELECT  id
              FROM    topten ti
              WHERE   ti.home = t1.home
              ORDER BY
                      ti.datetime DESC
              LIMIT 1
              ) lid
       FROM   (
              SELECT  DISTINCT home
              FROM    topten
              ) t1
       ) ro, topten t2
WHERE  t2.id = ro.lid

答案 4 :(得分:23)

我认为这会给你想要的结果:

SELECT   home, MAX(datetime)
FROM     my_table
GROUP BY home

如果您还需要其他列,只需与原始表进行联接(查看Michael La Voie answer)

最好的问候。

答案 5 :(得分:16)

由于人们似乎继续遇到这个帖子(评论日期范围从1。5年起)并不是那么简单:

SELECT * FROM (SELECT * FROM topten ORDER BY datetime DESC) tmp GROUP BY home

不需要聚合功能......

干杯。

答案 6 :(得分:10)

您也可以尝试这一个,对于大型表,查询性能会更好。它适用于每个家庭的记录不超过两个,并且它们的日期不同。更好的一般MySQL查询是上面的Michael La Voie的一个。

SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
FROM   t_scores_1 t1 
INNER JOIN t_scores_1 t2
   ON t1.home = t2.home
WHERE t1.date > t2.date

或者在Postgres或提供分析功能的dbs的情况下尝试

SELECT t.* FROM 
(SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
  , row_number() over (partition by t1.home order by t1.date desc) rw
 FROM   topten t1 
 INNER JOIN topten t2
   ON t1.home = t2.home
 WHERE t1.date > t2.date 
) t
WHERE t.rw = 1

答案 7 :(得分:8)

这适用于Oracle:

with table_max as(
  select id
       , home
       , datetime
       , player
       , resource
       , max(home) over (partition by home) maxhome
    from table  
)
select id
     , home
     , datetime
     , player
     , resource
  from table_max
 where home = maxhome

答案 8 :(得分:7)

尝试使用SQL Server:

WITH cte AS (
   SELECT home, MAX(year) AS year FROM Table1 GROUP BY home
)
SELECT * FROM Table1 a INNER JOIN cte ON a.home = cte.home AND a.year = cte.year

答案 9 :(得分:7)

SELECT  tt.*
FROM    TestTable tt 
INNER JOIN 
        (
        SELECT  coord, MAX(datetime) AS MaxDateTime 
        FROM    rapsa 
        GROUP BY
                krd 
        ) groupedtt
ON      tt.coord = groupedtt.coord
        AND tt.datetime = groupedtt.MaxDateTime

答案 10 :(得分:4)

SELECT c1, c2, c3, c4, c5 FROM table1 WHERE c3 = (select max(c3) from table)

SELECT * FROM table1 WHERE c3 = (select max(c3) from table1)

答案 11 :(得分:4)

这是MySQL版本,它只打印一个条目中存在重复MAX(日期时间)的条目。

你可以在这里测试http://www.sqlfiddle.com/#!2/0a4ae/1

样本数据

mysql> SELECT * from topten;
+------+------+---------------------+--------+----------+
| id   | home | datetime            | player | resource |
+------+------+---------------------+--------+----------+
|    1 |   10 | 2009-04-03 00:00:00 | john   |      399 |
|    2 |   11 | 2009-04-03 00:00:00 | juliet |      244 |
|    3 |   10 | 2009-03-03 00:00:00 | john   |      300 |
|    4 |   11 | 2009-03-03 00:00:00 | juliet |      200 |
|    5 |   12 | 2009-04-03 00:00:00 | borat  |      555 |
|    6 |   12 | 2009-03-03 00:00:00 | borat  |      500 |
|    7 |   13 | 2008-12-24 00:00:00 | borat  |      600 |
|    8 |   13 | 2009-01-01 00:00:00 | borat  |      700 |
|    9 |   10 | 2009-04-03 00:00:00 | borat  |      700 |
|   10 |   11 | 2009-04-03 00:00:00 | borat  |      700 |
|   12 |   12 | 2009-04-03 00:00:00 | borat  |      700 |
+------+------+---------------------+--------+----------+

带有用户变量

的MySQL版本
SELECT *
FROM (
    SELECT ord.*,
        IF (@prev_home = ord.home, 0, 1) AS is_first_appear,
        @prev_home := ord.home
    FROM (
        SELECT t1.id, t1.home, t1.player, t1.resource
        FROM topten t1
        INNER JOIN (
            SELECT home, MAX(datetime) AS mx_dt
            FROM topten
            GROUP BY home
          ) x ON t1.home = x.home AND t1.datetime = x.mx_dt
        ORDER BY home
    ) ord, (SELECT @prev_home := 0, @seq := 0) init
) y
WHERE is_first_appear = 1;
+------+------+--------+----------+-----------------+------------------------+
| id   | home | player | resource | is_first_appear | @prev_home := ord.home |
+------+------+--------+----------+-----------------+------------------------+
|    9 |   10 | borat  |      700 |               1 |                     10 |
|   10 |   11 | borat  |      700 |               1 |                     11 |
|   12 |   12 | borat  |      700 |               1 |                     12 |
|    8 |   13 | borat  |      700 |               1 |                     13 |
+------+------+--------+----------+-----------------+------------------------+
4 rows in set (0.00 sec)

接受答案'outout

SELECT tt.*
FROM topten tt
INNER JOIN
    (
    SELECT home, MAX(datetime) AS MaxDateTime
    FROM topten
    GROUP BY home
) groupedtt ON tt.home = groupedtt.home AND tt.datetime = groupedtt.MaxDateTime
+------+------+---------------------+--------+----------+
| id   | home | datetime            | player | resource |
+------+------+---------------------+--------+----------+
|    1 |   10 | 2009-04-03 00:00:00 | john   |      399 |
|    2 |   11 | 2009-04-03 00:00:00 | juliet |      244 |
|    5 |   12 | 2009-04-03 00:00:00 | borat  |      555 |
|    8 |   13 | 2009-01-01 00:00:00 | borat  |      700 |
|    9 |   10 | 2009-04-03 00:00:00 | borat  |      700 |
|   10 |   11 | 2009-04-03 00:00:00 | borat  |      700 |
|   12 |   12 | 2009-04-03 00:00:00 | borat  |      700 |
+------+------+---------------------+--------+----------+
7 rows in set (0.00 sec)

答案 12 :(得分:4)

使用子查询对每组最近一行进行gt的另一种方法,该查询基本上计算每组每行的排名,然后按照rank = 1筛选出最近的行

select a.*
from topten a
where (
  select count(*)
  from topten b
  where a.home = b.home
  and a.`datetime` < b.`datetime`
) +1 = 1

DEMO

为了更好地理解,以下是每行排名号的visual demo

  

通过阅读一些评论如果有两行具有相同的&#39; home&#39;和&#39; datetime&#39;字段值?

以上查询将失败,并将针对上述情况返回超过1行。为了掩盖这种情况,将需要另一个标准/参数/列来决定应该采取哪一行属于上述情况。通过查看样本数据集,我假设有一个主键列id应该设置为自动增量。因此,我们可以使用此列通过在CASE语句的帮助下调整相同的查询来选择最近的行,如

select a.*
from topten a
where (
  select count(*)
  from topten b
  where a.home = b.home
  and  case 
       when a.`datetime` = b.`datetime`
       then a.id < b.id
       else a.`datetime` < b.`datetime`
       end
) + 1 = 1

DEMO

以上查询将选择具有相同datetime

中最高ID的行

visual demo,每行的排名为

答案 13 :(得分:1)

试试这个

select * from mytable a join
(select home, max(datetime) datetime
from mytable
group by home) b
 on a.home = b.home and a.datetime = b.datetime

此致 ķ

答案 14 :(得分:1)

为什么不使用: SELECT home,MAX(datetime)AS MaxDateTime,player,resource FROM topten GROUP BY home 我错过了什么吗?

答案 15 :(得分:1)

希望以下查询将给出所需的输出:

Select id, home,datetime,player,resource, row_number() over (Partition by home ORDER by datetime desc) as rownum from tablename where rownum=1

答案 16 :(得分:0)

这是您需要的查询:

 SELECT b.id, a.home,b.[datetime],b.player,a.resource FROM
 (SELECT home,MAX(resource) AS resource FROM tbl_1 GROUP BY home) AS a

 LEFT JOIN

 (SELECT id,home,[datetime],player,resource FROM tbl_1) AS b
 ON  a.resource = b.resource WHERE a.home =b.home;

答案 17 :(得分:0)

@Michae在大多数情况下,已接受的答案将正常工作,但如果出现以下情况则会失败。

如果有2行具有HomeID和Datetime相同,则查询将返回两行,而不是根据需要返回不同的HomeID,如下所示在查询中添加Distinct。

SELECT DISTINCT tt.home  , tt.MaxDateTime
FROM topten tt
INNER JOIN
    (SELECT home, MAX(datetime) AS MaxDateTime
    FROM topten
    GROUP BY home) groupedtt 
ON tt.home = groupedtt.home 
AND tt.datetime = groupedtt.MaxDateTime

答案 18 :(得分:0)

(注意:Michael的答案非常适合目标列datetime不能为每个不同的home具有重复值的情况。)

如果您的表中有{strong> home x datetime的重复行,并且您只需要为每个home单独的列选择一行,这就是我的解决方案对此:

您的表需要一个唯一的列(如id)。如果没有,请创建一个视图并向其中添加随机列。

使用此查询为每个唯一的home值选择一行。如果重复id,则选择最低的datetime

SELECT tt.*
FROM topten tt
INNER JOIN
    (
    SELECT min(id) as min_id, home from topten tt2
    INNER JOIN 
        (
        SELECT home, MAX(datetime) AS MaxDateTime
        FROM topten
        GROUP BY home) groupedtt2
    ON tt2.home = groupedtt2.home
    ) as groupedtt
ON tt.id = groupedtt.id

答案 19 :(得分:0)

在 MySQL 8.0 中,这可以通过使用带有公共表表达式的 row_number() 窗口函数来有效地实现。

(这里的 row_number() 基本上为每个从 1 开始按资源降序排列的玩家的每一行生成唯一的序列。因此,对于序列号为 1 的每个玩家行将具有最高的资源值。现在我们需要做的就是正在为每个玩家选择序列号为 1 的行。可以通过围绕此查询编写外部查询来完成。但我们改用公共表表达式,因为它更具可读性。)

架构:

 create  TABLE TestTable(id INT, home INT, date DATETIME, 
   player VARCHAR(20), resource INT);
 INSERT INTO TestTable
 SELECT 1, 10, '2009-03-04', 'john', 399 UNION
 SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
 SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
 SELECT 3, 10, '2009-03-03', 'john', 300 UNION
 SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
 SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
 SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
 SELECT 8, 13, '2009-01-01', 'borat', 700

查询:

 with cte as 
 (
     select id, home, date , player, resource, 
     Row_Number()Over(Partition by home order by date desc) rownumber from TestTable
 )
 select id, home, date , player, resource from cte where rownumber=1

输出:

<头>
id home 日期 玩家 资源
1 10 2009-03-04 00:00:00 约翰 399
2 11 2009-03-04 00:00:00 朱丽叶 244
5 12 2009-03-04 00:00:00 borat 555
8 13 2009-01-01 00:00:00 borat 700

db<>fiddle here

答案 20 :(得分:0)

如果有 2 条日期和家相同的记录,则接受的答案对我不起作用。加入后会返回2条记录。虽然我需要选择其中的任何(随机)。此查询用作连接的子查询,因此在那里不可能只限制 1。 这是我达到预期结果的方式。但是不知道性能。

select SUBSTRING_INDEX(GROUP_CONCAT(id order by datetime desc separator ','),',',1) as id, home, MAX(datetime) as 'datetime'
 from topten
 group by (home)