How to select the latest member of each group in MariaDB?

Time: 2017-02-17 08:09:24

Tags: mysql group-by subquery mariadb

I have 3 tables:

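  1. account - account information
  2. machine - machine information
  3. account_machine - maps each account to a machine as of a date

Each account is handled by one machine. Over time an account can migrate to a different machine, but on any given day it is handled by exactly one machine. If an account is no longer active, the corresponding machine_id is 0. Given a date, I want to find all active accounts, so I came up with this query: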

    SELECT account.id 
    FROM account JOIN account_machine m 
    ON m.account_id=account.id && m.machine_id && m.machine_id=
    (SELECT machine_id 
    FROM account_machine 
    WHERE account_id=account.id && date<=20170215 
    ORDER BY date DESC LIMIT 1) 
    GROUP BY account.id;
    

This works in MySQL, but not in MariaDB.

    MariaDB [db]> select * from account_machine;
    +------------+------------+------------+
    | date       | account_id | machine_id |
    +------------+------------+------------+
    | 2013-01-01 |          1 |          1 |
    | 2013-01-01 |          8 |          1 |
    | 2013-01-01 |          2 |          2 |
    | 2013-01-01 |          3 |          2 |
    | 2013-01-01 |          4 |          3 |
    | 2013-01-01 |         12 |          3 |
    | 2016-04-01 |         24 |          3 |
    | 2013-01-01 |          5 |          5 |
    | 2013-01-01 |          6 |          8 |
    | 2013-01-01 |          7 |          6 |
    | 2014-01-01 |          9 |          6 |
    | 2013-01-01 |         10 |          4 |
    | 2014-07-01 |         11 |         10 |
    | 2014-01-01 |         13 |          7 |
    | 2014-01-01 |         14 |          7 |
    | 2014-07-01 |         15 |         11 |
    | 2014-07-01 |         16 |         14 |
    | 2014-07-01 |         17 |         12 |
    | 2015-01-01 |         18 |         13 |
    | 2015-01-01 |         19 |         13 |
    | 2015-04-01 |         20 |         13 |
    | 2015-04-01 |         21 |          7 |
    | 2015-04-01 |         22 |         13 |
    | 2016-04-01 |         23 |         15 |
    | 2016-05-01 |         25 |          9 |
    | 2016-05-19 |         26 |          4 |
    | 2014-08-06 |          1 |          0 |
    | 2016-01-15 |         12 |          0 |
    | 2015-11-04 |         19 |         12 |
    | 2016-05-23 |         10 |          0 |
    | 2016-05-26 |          2 |         18 |
    | 2016-05-27 |         13 |         16 |
    | 2016-06-02 |         27 |          3 |
    | 2016-06-02 |          4 |          0 |
    | 2016-06-08 |         28 |         17 |
    | 2016-06-21 |         29 |         19 |
    | 2016-07-11 |         30 |         20 |
    | 2016-08-15 |         13 |          0 |
    | 2016-08-19 |          2 |         18 |
    | 2016-08-25 |         31 |         21 |
    | 2016-09-08 |         32 |         20 |
    | 2016-11-30 |         19 |         12 |
    | 2016-11-30 |         22 |         13 |
    | 2017-01-20 |         33 |         15 |
    +------------+------------+------------+
    
    MariaDB [db]> select account.id from account join account_machine m on m.account_id=account.id && m.machine_id && m.machine_id=(select a.machine_id from account_machine a where a.account_id=account.id && a.date<=20170215 order by a.date desc limit 1) group by account.id;
    +----+
    | id |
    +----+
    | 23 |
    | 33 |
    +----+
    
    mysql> select account.id from account join account_machine m on m.account_id=account.id && m.machine_id && m.machine_id=(select a.machine_id from account_machine a where a.account_id=account.id && a.date<=20170215 order by a.date desc limit 1) group by account.id;
    +----+
    | id |
    +----+
    |  2 |
    |  3 |
    |  5 |
    |  6 |
    |  7 |
    |  8 |
    |  9 |
    | 11 |
    | 14 |
    | 15 |
    | 16 |
    | 17 |
    | 18 |
    | 19 |
    | 20 |
    | 21 |
    | 22 |
    | 23 |
    | 24 |
    | 25 |
    | 26 |
    | 27 |
    | 28 |
    | 29 |
    | 30 |
    | 31 |
    | 32 |
    | 33 |
    +----+
    

P.S. Here are the 3 tables so you can reproduce this:

    CREATE TABLE `account` (
      `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM;
    INSERT INTO `account` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33);
    
    CREATE TABLE `account_machine` (
      `date` date NOT NULL,
      `account_id` smallint(5) unsigned NOT NULL,
      `machine_id` smallint(5) unsigned NOT NULL,
      PRIMARY KEY (`date`,`account_id`)
    ) ENGINE=MyISAM;
    INSERT INTO `account_machine` VALUES ('2013-01-01',1,1),('2013-01-01',8,1),('2013-01-01',2,2),('2013-01-01',3,2),('2013-01-01',4,3),('2013-01-01',12,3),('2016-04-01',24,3),('2013-01-01',5,5),('2013-01-01',6,8),('2013-01-01',7,6),('2014-01-01',9,6),('2013-01-01',10,4),('2014-07-01',11,10),('2014-01-01',13,7),('2014-01-01',14,7),('2014-07-01',15,11),('2014-07-01',16,14),('2014-07-01',17,12),('2015-01-01',18,13),('2015-01-01',19,13),('2015-04-01',20,13),('2015-04-01',21,7),('2015-04-01',22,13),('2016-04-01',23,15),('2016-05-01',25,9),('2016-05-19',26,4),('2014-08-06',1,0),('2016-01-15',12,0),('2015-11-04',19,12),('2016-05-23',10,0),('2016-05-26',2,18),('2016-05-27',13,16),('2016-06-02',27,3),('2016-06-02',4,0),('2016-06-08',28,17),('2016-06-21',29,19),('2016-07-11',30,20),('2016-08-15',13,0),('2016-08-19',2,18),('2016-08-25',31,21),('2016-09-08',32,20),('2016-11-30',19,12),('2016-11-30',22,13),('2017-01-20',33,15);
    
    CREATE TABLE `machine` (
      `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM;
    INSERT INTO `machine` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22);
    

2 Answers:

Answer 0: (score: 1)

How about something like this?

SELECT am1.account_id AS id
FROM account_machine am1
JOIN (
    SELECT account_id, MAX(date) AS date
    FROM account_machine
    GROUP BY account_id
    ) am2
ON am1.account_id = am2.account_id
AND am1.date = am2.date
AND am1.machine_id != 0
ORDER BY am1.account_id;

+----+
| id |
+----+
|  2 |
|  3 |
|  5 |
|  6 |
|  7 |
|  8 |
|  9 |
| 11 |
| 14 |
| 15 |
| 16 |
| 17 |
| 18 |
| 19 |
| 20 |
| 21 |
| 22 |
| 23 |
| 24 |
| 25 |
| 26 |
| 27 |
| 28 |
| 29 |
| 30 |
| 31 |
| 32 |
| 33 |
+----+
28 rows in set (0.00 sec)
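
The query above picks each account's latest row overall, which matches the question's expected result only because no row in the sample is dated after the cutoff. A minimal sketch of the same join with the cutoff applied inside the derived table follows. It is an illustration, not the answer author's code: it runs on Python's built-in sqlite3 as a stand-in for MariaDB, uses a trimmed subset of the question's rows, and the names ROWS, ACTIVE_AS_OF, make_sample_db, and active_accounts are made up for the example.

```python
import sqlite3

# Trimmed sample of the question's account_machine rows: (date, account_id, machine_id).
ROWS = [
    ("2013-01-01", 1, 1), ("2014-08-06", 1, 0),
    ("2013-01-01", 2, 2), ("2016-05-26", 2, 18), ("2016-08-19", 2, 18),
    ("2014-01-01", 13, 7), ("2016-05-27", 13, 16), ("2016-08-15", 13, 0),
    ("2017-01-20", 33, 15),
]

# Same shape as the answer's query, plus a cutoff inside the derived table.
ACTIVE_AS_OF = """
SELECT am1.account_id AS id
FROM account_machine AS am1
JOIN (
    SELECT account_id, MAX(date) AS date
    FROM account_machine
    WHERE date <= :as_of          -- ignore rows dated after the cutoff
    GROUP BY account_id
) AS am2
  ON am1.account_id = am2.account_id
 AND am1.date = am2.date
WHERE am1.machine_id != 0         -- machine_id 0 marks an inactive account
ORDER BY am1.account_id
"""


def make_sample_db():
    """Build an in-memory SQLite database holding the trimmed sample."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_machine ("
        " date TEXT NOT NULL, account_id INTEGER NOT NULL,"
        " machine_id INTEGER NOT NULL, PRIMARY KEY (date, account_id))"
    )
    conn.executemany("INSERT INTO account_machine VALUES (?, ?, ?)", ROWS)
    return conn


def active_accounts(conn, as_of):
    """Return ids whose latest row on or before as_of has a non-zero machine_id."""
    return [row[0] for row in conn.execute(ACTIVE_AS_OF, {"as_of": as_of})]


if __name__ == "__main__":
    conn = make_sample_db()
    print(active_accounts(conn, "2017-02-15"))
    print(active_accounts(conn, "2016-06-01"))
```

On MariaDB or MySQL the SQL text should carry over largely unchanged, with the cutoff written as '2017-02-15'; the placeholder syntax depends on the client library in use.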

I would be curious to see the output of EXPLAIN EXTENDED / SHOW WARNINGS from both MySQL and MariaDB. That would show how the query optimizer rewrote the query. For example:

root@localhost [stack]> EXPLAIN EXTENDED SELECT am1.account_id AS id
    -> FROM account_machine am1
    -> JOIN (
    ->     SELECT account_id, MAX(date) AS date
    ->     FROM account_machine
    ->     GROUP BY account_id
    -> ) am2
    -> ON am1.account_id = am2.account_id
    -> AND am1.date = am2.date
    -> AND am1.machine_id != 0
    -> ORDER BY am1.account_id\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 44
     filtered: 100.00
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: am1
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 5
          ref: am2.date,am2.account_id
         rows: 1
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: account_machine
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 5
          ref: NULL
         rows: 44
     filtered: 100.00
        Extra: Using index; Using temporary; Using filesort
3 rows in set, 1 warning (0.00 sec)

root@localhost [stack]> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `stack`.`am1`.`account_id` AS `id` from
`stack`.`account_machine` `am1` join (select 
`stack`.`account_machine`.`account_id` AS 
`account_id`,max(`stack`.`account_machine`.`date`) AS `date` from 
`stack`.`account_machine` group by 
`stack`.`account_machine`.`account_id`) `am2` where 
((`stack`.`am1`.`account_id` = `am2`.`account_id`) and 
(`stack`.`am1`.`date` = `am2`.`date`) and (`stack`.`am1`.`machine_id` 
<> 0)) order by `stack`.`am1`.`account_id`
1 row in set (0.00 sec)

Obviously this is not a high-performance query without an index, but for a limited data set it is fine.
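
To make the index remark concrete: the derived table groups by account_id and takes MAX(date), so a composite index on (account_id, date) is the usual candidate. The sketch below is an assumption-laden illustration, run on Python's built-in sqlite3 rather than MariaDB; the index name idx_account_date and the helper query_plan are hypothetical. On MariaDB or MySQL the corresponding statement would be along the lines of ALTER TABLE account_machine ADD INDEX idx_account_date (account_id, date), and whether the optimizer actually uses it should be checked with EXPLAIN on the real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_machine ("
    " date TEXT NOT NULL, account_id INTEGER NOT NULL,"
    " machine_id INTEGER NOT NULL, PRIMARY KEY (date, account_id))"
)
conn.executemany(
    "INSERT INTO account_machine VALUES (?, ?, ?)",
    [("2013-01-01", 2, 2), ("2016-08-19", 2, 18),
     ("2013-01-01", 4, 3), ("2016-06-02", 4, 0)],
)

# Hypothetical index name; account_id leads so each account's rows sit together by date.
conn.execute("CREATE INDEX idx_account_date ON account_machine (account_id, date)")

LATEST_PER_ACCOUNT = (
    "SELECT account_id, MAX(date) FROM account_machine GROUP BY account_id"
)


def query_plan(conn, sql):
    """Return SQLite's plan description lines for sql (wording varies by version)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Latest date per account, the part of the answer's query the index is meant to serve.
latest = conn.execute(LATEST_PER_ACCOUNT + " ORDER BY account_id").fetchall()

if __name__ == "__main__":
    print(latest)
    for line in query_plan(conn, LATEST_PER_ACCOUNT):
        print(line)
```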

Answer 1: (score: 0)

I suspect your query has a design flaw: what if the subquery returns machine_id = 0 for an account_id? After that, it will not look any further.

When using JOIN ... ON, it is good practice to put only the joining conditions in the ON clause and to move the filtering conditions into WHERE.

It looks like this would be simpler and faster:

    SELECT account_id
    FROM account_machine AS m
    WHERE machine_id != 0
      AND date <= 20170215
      AND EXISTS ( SELECT * FROM account WHERE id = m.account_id )
    ORDER BY date DESC
    LIMIT 1;

Perhaps the EXISTS() test is redundant and can be removed?

A suitable index might help performance.

(No, I did not spot why the two servers behave differently. See if my version works.)
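
The advice about keeping join conditions in ON and filters in WHERE can be sketched against the asker's original query. This is a hedged illustration, not the answer author's statement: it runs on Python's built-in sqlite3 as a stand-in for MariaDB, uses a trimmed subset of the question's rows, and the names ROWS, ACTIVE_ACCOUNTS, make_sample_db, and active_accounts are made up for the example. It also matches on the latest date instead of on machine_id, so an account such as 19, which has two rows with the same machine, yields a single row and no GROUP BY is needed.

```python
import sqlite3

# Trimmed sample of the question's rows: (date, account_id, machine_id).
ROWS = [
    ("2013-01-01", 1, 1), ("2014-08-06", 1, 0),    # account 1: later deactivated
    ("2013-01-01", 2, 2), ("2016-08-19", 2, 18),   # account 2: migrated, still active
    ("2015-01-01", 19, 13), ("2015-11-04", 19, 12), ("2016-11-30", 19, 12),
]

# ON carries only the join condition; WHERE carries the filters.
ACTIVE_ACCOUNTS = """
SELECT a.id
FROM account AS a
JOIN account_machine AS m
  ON m.account_id = a.id
WHERE m.machine_id != 0
  AND m.date = (
        SELECT MAX(x.date)
        FROM account_machine AS x
        WHERE x.account_id = a.id
          AND x.date <= :as_of
      )
ORDER BY a.id
"""


def make_sample_db():
    """Build an in-memory SQLite database with account and account_machine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE account_machine ("
        " date TEXT NOT NULL, account_id INTEGER NOT NULL,"
        " machine_id INTEGER NOT NULL, PRIMARY KEY (date, account_id))"
    )
    conn.executemany("INSERT INTO account_machine VALUES (?, ?, ?)", ROWS)
    account_ids = sorted({account_id for _, account_id, _ in ROWS})
    conn.executemany("INSERT INTO account VALUES (?)", [(i,) for i in account_ids])
    return conn


def active_accounts(conn, as_of):
    """Return ids of accounts whose latest mapping on or before as_of is non-zero."""
    return [row[0] for row in conn.execute(ACTIVE_ACCOUNTS, {"as_of": as_of})]


if __name__ == "__main__":
    conn = make_sample_db()
    print(active_accounts(conn, "2017-02-15"))
```

An account with no row on or before the cutoff makes the subquery return NULL, so the date comparison fails and the account is left out.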