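I have 3 tables: account, machine, and account_machine.
Each account is handled by one machine. Over time an account can migrate to a different machine, but on any given day it is handled by exactly one machine. If an account is no longer active, the corresponding machine_id is 0. Given a date, I want to find all active accounts, so I came up with this query: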
SELECT account.id
FROM account JOIN account_machine m
ON m.account_id=account.id && m.machine_id && m.machine_id=
(SELECT machine_id
FROM account_machine
WHERE account_id=account.id && date<=20170215
ORDER BY date DESC LIMIT 1)
GROUP BY account.id;
This works on MySQL, but not on MariaDB.
MariaDB [db]> select * from account_machine;
+------------+------------+------------+
| date | account_id | machine_id |
+------------+------------+------------+
| 2013-01-01 | 1 | 1 |
| 2013-01-01 | 8 | 1 |
| 2013-01-01 | 2 | 2 |
| 2013-01-01 | 3 | 2 |
| 2013-01-01 | 4 | 3 |
| 2013-01-01 | 12 | 3 |
| 2016-04-01 | 24 | 3 |
| 2013-01-01 | 5 | 5 |
| 2013-01-01 | 6 | 8 |
| 2013-01-01 | 7 | 6 |
| 2014-01-01 | 9 | 6 |
| 2013-01-01 | 10 | 4 |
| 2014-07-01 | 11 | 10 |
| 2014-01-01 | 13 | 7 |
| 2014-01-01 | 14 | 7 |
| 2014-07-01 | 15 | 11 |
| 2014-07-01 | 16 | 14 |
| 2014-07-01 | 17 | 12 |
| 2015-01-01 | 18 | 13 |
| 2015-01-01 | 19 | 13 |
| 2015-04-01 | 20 | 13 |
| 2015-04-01 | 21 | 7 |
| 2015-04-01 | 22 | 13 |
| 2016-04-01 | 23 | 15 |
| 2016-05-01 | 25 | 9 |
| 2016-05-19 | 26 | 4 |
| 2014-08-06 | 1 | 0 |
| 2016-01-15 | 12 | 0 |
| 2015-11-04 | 19 | 12 |
| 2016-05-23 | 10 | 0 |
| 2016-05-26 | 2 | 18 |
| 2016-05-27 | 13 | 16 |
| 2016-06-02 | 27 | 3 |
| 2016-06-02 | 4 | 0 |
| 2016-06-08 | 28 | 17 |
| 2016-06-21 | 29 | 19 |
| 2016-07-11 | 30 | 20 |
| 2016-08-15 | 13 | 0 |
| 2016-08-19 | 2 | 18 |
| 2016-08-25 | 31 | 21 |
| 2016-09-08 | 32 | 20 |
| 2016-11-30 | 19 | 12 |
| 2016-11-30 | 22 | 13 |
| 2017-01-20 | 33 | 15 |
+------------+------------+------------+
MariaDB [db]> select account.id from account join account_machine m on m.account_id=account.id && m.machine_id && m.machine_id=(select a.machine_id from account_machine a where a.account_id=account.id && a.date<=20170215 order by a.date desc limit 1) group by account.id;
+----+
| id |
+----+
| 23 |
| 33 |
+----+
mysql> select account.id from account join account_machine m on m.account_id=account.id && m.machine_id && m.machine_id=(select a.machine_id from account_machine a where a.account_id=account.id && a.date<=20170215 order by a.date desc limit 1) group by account.id;
+----+
| id |
+----+
| 2 |
| 3 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 11 |
| 14 |
| 15 |
| 16 |
| 17 |
| 18 |
| 19 |
| 20 |
| 21 |
| 22 |
| 23 |
| 24 |
| 25 |
| 26 |
| 27 |
| 28 |
| 29 |
| 30 |
| 31 |
| 32 |
| 33 |
+----+
P.S. Here are the 3 tables so you can reproduce this:
CREATE TABLE `account` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=MyISAM;
INSERT INTO `account` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33);
CREATE TABLE `account_machine` (
`date` date NOT NULL,
`account_id` smallint(5) unsigned NOT NULL,
`machine_id` smallint(5) unsigned NOT NULL,
PRIMARY KEY (`date`,`account_id`)
) ENGINE=MyISAM;
INSERT INTO `account_machine` VALUES ('2013-01-01',1,1),('2013-01-01',8,1),('2013-01-01',2,2),('2013-01-01',3,2),('2013-01-01',4,3),('2013-01-01',12,3),('2016-04-01',24,3),('2013-01-01',5,5),('2013-01-01',6,8),('2013-01-01',7,6),('2014-01-01',9,6),('2013-01-01',10,4),('2014-07-01',11,10),('2014-01-01',13,7),('2014-01-01',14,7),('2014-07-01',15,11),('2014-07-01',16,14),('2014-07-01',17,12),('2015-01-01',18,13),('2015-01-01',19,13),('2015-04-01',20,13),('2015-04-01',21,7),('2015-04-01',22,13),('2016-04-01',23,15),('2016-05-01',25,9),('2016-05-19',26,4),('2014-08-06',1,0),('2016-01-15',12,0),('2015-11-04',19,12),('2016-05-23',10,0),('2016-05-26',2,18),('2016-05-27',13,16),('2016-06-02',27,3),('2016-06-02',4,0),('2016-06-08',28,17),('2016-06-21',29,19),('2016-07-11',30,20),('2016-08-15',13,0),('2016-08-19',2,18),('2016-08-25',31,21),('2016-09-08',32,20),('2016-11-30',19,12),('2016-11-30',22,13),('2017-01-20',33,15);
CREATE TABLE `machine` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=MyISAM;
INSERT INTO `machine` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22);
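To have a result that does not depend on either server's optimizer, the rule described in the question (take each account's most recent row on or before the cutoff date, and treat the account as active when that row's machine_id is not 0) can be computed directly. The following is a minimal sketch in plain Python, not part of the original post; the function name `active_accounts` is hypothetical, and dates are assumed to be ISO `YYYY-MM-DD` strings so that string comparison matches chronological order.

```python
# Reference calculation in plain Python: no SQL engine, so no optimizer involved.
# Rows are (date, account_id, machine_id), copied from the INSERT statement above.
ROWS = [
    ("2013-01-01", 1, 1), ("2013-01-01", 8, 1), ("2013-01-01", 2, 2),
    ("2013-01-01", 3, 2), ("2013-01-01", 4, 3), ("2013-01-01", 12, 3),
    ("2016-04-01", 24, 3), ("2013-01-01", 5, 5), ("2013-01-01", 6, 8),
    ("2013-01-01", 7, 6), ("2014-01-01", 9, 6), ("2013-01-01", 10, 4),
    ("2014-07-01", 11, 10), ("2014-01-01", 13, 7), ("2014-01-01", 14, 7),
    ("2014-07-01", 15, 11), ("2014-07-01", 16, 14), ("2014-07-01", 17, 12),
    ("2015-01-01", 18, 13), ("2015-01-01", 19, 13), ("2015-04-01", 20, 13),
    ("2015-04-01", 21, 7), ("2015-04-01", 22, 13), ("2016-04-01", 23, 15),
    ("2016-05-01", 25, 9), ("2016-05-19", 26, 4), ("2014-08-06", 1, 0),
    ("2016-01-15", 12, 0), ("2015-11-04", 19, 12), ("2016-05-23", 10, 0),
    ("2016-05-26", 2, 18), ("2016-05-27", 13, 16), ("2016-06-02", 27, 3),
    ("2016-06-02", 4, 0), ("2016-06-08", 28, 17), ("2016-06-21", 29, 19),
    ("2016-07-11", 30, 20), ("2016-08-15", 13, 0), ("2016-08-19", 2, 18),
    ("2016-08-25", 31, 21), ("2016-09-08", 32, 20), ("2016-11-30", 19, 12),
    ("2016-11-30", 22, 13), ("2017-01-20", 33, 15),
]


def active_accounts(rows, cutoff):
    """Return the sorted ids of accounts whose most recent row dated on or
    before `cutoff` has a non-zero machine_id.

    Dates are ISO 'YYYY-MM-DD' strings, so comparing them as strings
    orders them chronologically.
    """
    latest = {}  # account_id -> (date, machine_id) of the newest eligible row
    for day, account_id, machine_id in rows:
        if day > cutoff:
            continue  # ignore assignments made after the cutoff
        if account_id not in latest or day > latest[account_id][0]:
            latest[account_id] = (day, machine_id)
    return sorted(acc for acc, (_, machine) in latest.items() if machine != 0)


if __name__ == "__main__":
    print(active_accounts(ROWS, "2017-02-15"))
```

On the sample data and the cutoff 2017-02-15 this yields 28 ids, every account except 1, 4, 10, 12 and 13, which is the list MySQL printed above rather than the two rows MariaDB returned.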
Answer 0 (score: 1)
How about something like this?
SELECT am1.account_id AS id
FROM account_machine am1
JOIN (
SELECT account_id, MAX(date) AS date
FROM account_machine
GROUP BY account_id
) am2
ON am1.account_id = am2.account_id
AND am1.date = am2.date
AND am1.machine_id != 0
ORDER BY am1.account_id;
+----+
| id |
+----+
| 2 |
| 3 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 11 |
| 14 |
| 15 |
| 16 |
| 17 |
| 18 |
| 19 |
| 20 |
| 21 |
| 22 |
| 23 |
| 24 |
| 25 |
| 26 |
| 27 |
| 28 |
| 29 |
| 30 |
| 31 |
| 32 |
| 33 |
+----+
28 rows in set (0.00 sec)
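Note that the query above has no date condition. It matches the expected result here because every row in the sample data is dated before the 2017-02-15 cutoff used in the question. A variant that honours an arbitrary cutoff would apply the date filter inside the derived table, before the latest row is chosen. The sketch below tries that idea with Python's built-in sqlite3 module as a stand-in engine (an assumption made for portability; it is neither MySQL nor MariaDB), using a handful of rows from the sample data. The helper names `make_db` and `active_accounts` are hypothetical.

```python
import sqlite3

# A few rows taken from the sample data above (accounts 1, 2, 13 and 33).
ROWS = [
    ("2013-01-01", 1, 1), ("2014-08-06", 1, 0),
    ("2013-01-01", 2, 2), ("2016-05-26", 2, 18), ("2016-08-19", 2, 18),
    ("2014-01-01", 13, 7), ("2016-05-27", 13, 16), ("2016-08-15", 13, 0),
    ("2017-01-20", 33, 15),
]

QUERY = """
SELECT am1.account_id AS id
FROM account_machine am1
JOIN (
    SELECT account_id, MAX(date) AS date
    FROM account_machine
    WHERE date <= ?              -- cutoff applied before picking the latest row
    GROUP BY account_id
) am2
  ON am1.account_id = am2.account_id
 AND am1.date = am2.date
WHERE am1.machine_id != 0
ORDER BY am1.account_id
"""


def make_db():
    """Create an in-memory SQLite database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_machine ("
        " date TEXT NOT NULL,"
        " account_id INTEGER NOT NULL,"
        " machine_id INTEGER NOT NULL,"
        " PRIMARY KEY (date, account_id))"
    )
    conn.executemany("INSERT INTO account_machine VALUES (?, ?, ?)", ROWS)
    return conn


def active_accounts(conn, cutoff):
    """Return ids of accounts that were active on `cutoff` (ISO date string)."""
    return [row[0] for row in conn.execute(QUERY, (cutoff,))]


if __name__ == "__main__":
    conn = make_db()
    print(active_accounts(conn, "2017-02-15"))
    print(active_accounts(conn, "2016-06-30"))
```

With a cutoff of 2016-06-30, account 13 is reported as active because its deactivation row is dated 2016-08-15; without the date filter in the derived table it would be dropped.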
I would be curious to see the output of EXPLAIN EXTENDED / SHOW WARNINGS from both MySQL and MariaDB. That shows how the query optimizer rewrote the query. For example:
root@localhost [stack]> EXPLAIN EXTENDED SELECT am1.account_id AS id
-> FROM account_machine am1
-> JOIN (
-> SELECT account_id, MAX(date) AS date
-> FROM account_machine
-> GROUP BY account_id
-> ) am2
-> ON am1.account_id = am2.account_id
-> AND am1.date = am2.date
-> AND am1.machine_id != 0
-> ORDER BY am1.account_id\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 44
filtered: 100.00
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: am1
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 5
ref: am2.date,am2.account_id
rows: 1
filtered: 100.00
Extra: Using where
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: account_machine
type: index
possible_keys: NULL
key: PRIMARY
key_len: 5
ref: NULL
rows: 44
filtered: 100.00
Extra: Using index; Using temporary; Using filesort
3 rows in set, 1 warning (0.00 sec)
root@localhost [stack]> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: select `stack`.`am1`.`account_id` AS `id` from
`stack`.`account_machine` `am1` join (select
`stack`.`account_machine`.`account_id` AS
`account_id`,max(`stack`.`account_machine`.`date`) AS `date` from
`stack`.`account_machine` group by
`stack`.`account_machine`.`account_id`) `am2` where
((`stack`.`am1`.`account_id` = `am2`.`account_id`) and
(`stack`.`am1`.`date` = `am2`.`date`) and (`stack`.`am1`.`machine_id`
<> 0)) order by `stack`.`am1`.`account_id`
1 row in set (0.00 sec)
Obviously this is not a high-performance query without indexes, but for a limited data set it is fine.
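The effect of an extra index can be explored on a small scale before touching a real server. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 module rather than MySQL or MariaDB, the index name `idx_account_date` and its column order (account_id, date) are choices made for this example and do not come from the thread, and SQLite's `EXPLAIN QUERY PLAN` merely plays the role that EXPLAIN plays above. Plan text differs between engines and versions, so it is printed rather than interpreted.

```python
import sqlite3

# A few rows taken from the sample data above (accounts 1, 2, 13 and 33).
ROWS = [
    ("2013-01-01", 1, 1), ("2014-08-06", 1, 0),
    ("2013-01-01", 2, 2), ("2016-05-26", 2, 18), ("2016-08-19", 2, 18),
    ("2014-01-01", 13, 7), ("2016-05-27", 13, 16), ("2016-08-15", 13, 0),
    ("2017-01-20", 33, 15),
]

# Latest row per account on or before the cutoff, kept only if machine_id != 0.
QUERY = """
SELECT m.account_id
FROM account_machine AS m
WHERE m.machine_id != 0
  AND m.date = (SELECT MAX(date)
                FROM account_machine
                WHERE account_id = m.account_id AND date <= ?)
ORDER BY m.account_id
"""


def make_db():
    """Create an in-memory SQLite database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_machine ("
        " date TEXT NOT NULL,"
        " account_id INTEGER NOT NULL,"
        " machine_id INTEGER NOT NULL,"
        " PRIMARY KEY (date, account_id))"
    )
    conn.executemany("INSERT INTO account_machine VALUES (?, ?, ?)", ROWS)
    return conn


def run(conn, cutoff):
    """Return the active account ids for the given cutoff."""
    return [row[0] for row in conn.execute(QUERY, (cutoff,))]


def plan(conn, cutoff):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (cutoff,))
    return [row[-1] for row in rows]


if __name__ == "__main__":
    conn = make_db()
    print(run(conn, "2017-02-15"), plan(conn, "2017-02-15"))
    # Hypothetical index: account_id first, so per-account lookups can use it.
    conn.execute("CREATE INDEX idx_account_date ON account_machine (account_id, date)")
    print(run(conn, "2017-02-15"), plan(conn, "2017-02-15"))
```

The index must not change the result, only the plan, so comparing the two result lists is a cheap sanity check.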
Answer 1 (score: 0)
I suspect your query has a design flaw: if the subquery returns machine_id = 0, it will not look any further.
When using JOIN...ON, it is good practice to put only the joining information in the ON clause, not the filtering information; that goes into WHERE.
It seems like this would be simpler and faster:
SELECT account_id
FROM account_machine AS m
WHERE machine_id != 0
AND date <= 20170215
AND EXISTS (
SELECT *
FROM account
WHERE id = m.account_id
)
ORDER BY date DESC
LIMIT 1;
Perhaps the EXISTS() test is redundant and can be removed?
An index may help performance.
(No, I did not figure out why the two servers behave differently. See whether my version works.)
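One thing to check before adopting the query in this answer: ORDER BY date DESC LIMIT 1 applies to the whole result, so as written it returns at most one account, whereas the question asks for all active accounts. The sketch below compares the query as written with a per-account alternative that uses NOT EXISTS to keep only rows with no newer row up to the cutoff. It is an illustration under assumptions, not the answerer's method: it runs on Python's sqlite3 module instead of MySQL or MariaDB, uses an ISO date string instead of the numeric literal 20170215, and the alias `newer` and helper names are hypothetical.

```python
import sqlite3

# A few rows taken from the sample data above (accounts 1, 2, 13 and 33).
ROWS = [
    ("2013-01-01", 1, 1), ("2014-08-06", 1, 0),
    ("2013-01-01", 2, 2), ("2016-05-26", 2, 18), ("2016-08-19", 2, 18),
    ("2014-01-01", 13, 7), ("2016-05-27", 13, 16), ("2016-08-15", 13, 0),
    ("2017-01-20", 33, 15),
]
ACCOUNTS = [(1,), (2,), (13,), (33,)]

# The query from this answer, with the cutoff passed as a named parameter.
AS_WRITTEN = """
SELECT account_id
FROM account_machine AS m
WHERE machine_id != 0
  AND date <= :cutoff
  AND EXISTS (SELECT * FROM account WHERE id = m.account_id)
ORDER BY date DESC
LIMIT 1
"""

# Per-account alternative: keep a row only if it is that account's newest
# row on or before the cutoff, and only if it points at a real machine.
PER_ACCOUNT = """
SELECT m.account_id
FROM account_machine AS m
WHERE m.machine_id != 0
  AND m.date <= :cutoff
  AND NOT EXISTS (
      SELECT 1
      FROM account_machine AS newer
      WHERE newer.account_id = m.account_id
        AND newer.date > m.date
        AND newer.date <= :cutoff
  )
ORDER BY m.account_id
"""


def make_db():
    """Create an in-memory SQLite database with both tables populated."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE account_machine ("
        " date TEXT NOT NULL,"
        " account_id INTEGER NOT NULL,"
        " machine_id INTEGER NOT NULL,"
        " PRIMARY KEY (date, account_id))"
    )
    conn.executemany("INSERT INTO account VALUES (?)", ACCOUNTS)
    conn.executemany("INSERT INTO account_machine VALUES (?, ?, ?)", ROWS)
    return conn


def run(conn, sql, cutoff):
    """Run `sql` with the given cutoff and return the account ids."""
    return [row[0] for row in conn.execute(sql, {"cutoff": cutoff})]


if __name__ == "__main__":
    conn = make_db()
    print(run(conn, AS_WRITTEN, "2017-02-15"))   # a single account
    print(run(conn, PER_ACCOUNT, "2017-02-15"))  # one entry per active account
```

The NOT EXISTS form also avoids the correlated ORDER BY ... LIMIT 1 subquery on which the two servers disagree in the question, though whether it behaves identically on MySQL and MariaDB would need to be checked there.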