I have a table with some data, and I want to select the latest row for each type.
Table:
+----+------+------+---------------------+
| ID | data | type | date                |
+----+------+------+---------------------+
|  1 | just |    2 | 2010-08-07 14:24:48 |
|  2 | some |    2 | 2010-08-07 18:07:32 |
|  3 | data |    9 | 2010-08-06 02:52:17 |
|  4 | abcd |    1 | 2010-08-08 17:23:22 |
|  5 | efg1 |    5 | 2010-07-10 21:36:55 |
|  6 | c123 |    5 | 2010-07-10 20:44:36 |
|  7 | bbey |   12 | 2010-08-09 09:01:26 |
+----+------+------+---------------------+
At the moment I am using a simple subquery, and it seems to work fine:
SELECT `data`, `type`, `date`
FROM `table1`
WHERE `date` = (
    SELECT MAX(`date`)
    FROM `table1` AS tbl2
    WHERE tbl2.`type` = `table1`.`type`
)
GROUP BY `type`
ORDER BY `type`, `date`
Result:
+------+------+---------------------+
| data | type | date                |
+------+------+---------------------+
| abcd |    1 | 2010-08-08 17:23:22 |
| some |    2 | 2010-08-07 18:07:32 |
| efg1 |    5 | 2010-07-10 21:36:55 |
| data |    9 | 2010-08-06 02:52:17 |
| bbey |   12 | 2010-08-09 09:01:26 |
+------+------+---------------------+
My question is whether there is a better way to do this: some optimization or improvement, or could it perhaps be done with a join?
Answer 0 (score: 2)
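You are using a correlated subquery. The subquery depends on the outer query, so it has to be executed once for each row of the outer query.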
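If you want to check this on your own server, you can prefix the query with EXPLAIN. This is only a sketch, assuming MySQL and the `table1` from the question. The GROUP BY and ORDER BY are left out to keep the plan short. A correlated subquery is normally reported with a select_type of DEPENDENT SUBQUERY, although the exact plan depends on your MySQL version and indexes.

```sql
-- Ask MySQL how it plans to run the correlated-subquery version.
-- Check the select_type column: a subquery that is re-evaluated for
-- each outer row is normally labelled DEPENDENT SUBQUERY.
EXPLAIN
SELECT `data`, `type`, `date`
FROM `table1`
WHERE `date` = (
    SELECT MAX(`date`)
    FROM `table1` AS tbl2
    WHERE tbl2.`type` = `table1`.`type`
);
```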
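In general, this can be improved by using the subquery as a derived table instead. Because a derived table is not correlated with the outer query, it only needs to be evaluated once, so this solution is considered more scalable: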
SELECT t1.`data`, t1.`type`, t1.`date`
FROM `table1` t1
JOIN (
    SELECT MAX(`date`) `max_date`, `type`
    FROM `table1`
    GROUP BY `type`
) der_t ON (der_t.`max_date` = t1.`date` AND der_t.`type` = t1.`type`)
GROUP BY t1.`type`
ORDER BY t1.`type`, t1.`date`;
Test case:
CREATE TABLE table1 (id int, data varchar(10), type int, date datetime);
INSERT INTO table1 VALUES (1, 'just', 2, '2010-08-07 14:24:48');
INSERT INTO table1 VALUES (2, 'some', 2, '2010-08-07 18:07:32');
INSERT INTO table1 VALUES (3, 'data', 9, '2010-08-06 02:52:17');
INSERT INTO table1 VALUES (4, 'abcd', 1, '2010-08-08 17:23:22');
INSERT INTO table1 VALUES (5, 'efg1', 5, '2010-07-10 21:36:55');
INSERT INTO table1 VALUES (6, 'c123', 5, '2010-07-10 20:44:36');
INSERT INTO table1 VALUES (7, 'bbey', 12, '2010-08-09 09:01:26');
Result:
+------+------+---------------------+
| data | type | date                |
+------+------+---------------------+
| abcd |    1 | 2010-08-08 17:23:22 |
| some |    2 | 2010-08-07 18:07:32 |
| efg1 |    5 | 2010-07-10 21:36:55 |
| data |    9 | 2010-08-06 02:52:17 |
| bbey |   12 | 2010-08-09 09:01:26 |
+------+------+---------------------+
5 rows in set (0.00 sec)
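One caveat: the outer GROUP BY t1.`type` selects columns that are neither aggregated nor grouped, which relies on MySQL's permissive GROUP BY handling. With the ONLY_FULL_GROUP_BY SQL mode enabled (the default since MySQL 5.7.5, as far as I know), the query is likely to be rejected. A possible variant without the outer GROUP BY is sketched below. It assumes you can accept more than one row per type when several rows share the same latest date. No two rows in the test case share both type and date, so it should return the same five rows there.

```sql
-- Derived-table variant without the non-standard outer GROUP BY.
-- Assumption: ties on the latest date within a type are acceptable
-- (each tied row is returned).
SELECT t1.`data`, t1.`type`, t1.`date`
FROM `table1` t1
JOIN (
    -- One row per type, carrying that type's latest date.
    SELECT `type`, MAX(`date`) AS `max_date`
    FROM `table1`
    GROUP BY `type`
) der_t ON der_t.`type` = t1.`type`
       AND der_t.`max_date` = t1.`date`
ORDER BY t1.`type`;
```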
It also looks like you can avoid the subquery altogether with a solution like the following:
SELECT t1.`data`, t1.`type`, t1.`date`
FROM `table1` t1
LEFT JOIN `table1` t2 ON (t1.`date` < t2.`date` AND t1.`type` = t2.`type`)
WHERE t2.`date` IS NULL
GROUP BY t1.`type`
ORDER BY t1.`type`, t1.`date`;
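To see why the LEFT JOIN version works, it may help to look at the joined rows before the IS NULL filter is applied. The sketch below uses the test-case table and restricts the output to type 5. The column aliases are only for illustration.

```sql
-- Pair every type-5 row (t1) with any newer row of the same type (t2).
-- A row with no newer partner gets NULLs on the t2 side.
SELECT t1.`id`   AS t1_id,
       t1.`date` AS t1_date,
       t2.`id`   AS t2_id,
       t2.`date` AS t2_date
FROM `table1` t1
LEFT JOIN `table1` t2
       ON t1.`type` = t2.`type`
      AND t1.`date` < t2.`date`
WHERE t1.`type` = 5
ORDER BY t1.`id`;
-- With the test data this returns two rows:
--   t1_id 5 (21:36:55): t2_id NULL -> no newer row, so it is the latest
--   t1_id 6 (20:44:36): t2_id 5    -> a newer row exists, so the
--                                     IS NULL filter would discard it
```

Filtering on t2.`date` IS NULL therefore keeps exactly the rows for which no newer row of the same type exists.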
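In general, this works better than the derived-table solution, but if performance is critical you may want to measure both. The article provided by @Naktibalda also offers some other solutions you may want to test.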
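One possible starting point for such a comparison is to look at the execution plans of both queries. This is only a sketch, assuming MySQL. The composite index and its name `idx_table1_type_date` are assumptions and not part of the original schema, and the outer GROUP BY is left out so the statements also run with ONLY_FULL_GROUP_BY enabled.

```sql
-- Hypothetical index: both queries look rows up by type and date,
-- so a composite index on (type, date) is likely to help.
CREATE INDEX idx_table1_type_date ON `table1` (`type`, `date`);

-- Plan for the derived-table solution.
EXPLAIN
SELECT t1.`data`, t1.`type`, t1.`date`
FROM `table1` t1
JOIN (
    SELECT `type`, MAX(`date`) AS `max_date`
    FROM `table1`
    GROUP BY `type`
) der_t ON der_t.`type` = t1.`type`
       AND der_t.`max_date` = t1.`date`;

-- Plan for the LEFT JOIN solution.
EXPLAIN
SELECT t1.`data`, t1.`type`, t1.`date`
FROM `table1` t1
LEFT JOIN `table1` t2
       ON t1.`type` = t2.`type`
      AND t1.`date` < t2.`date`
WHERE t2.`date` IS NULL;
```

Plans only show what the optimizer intends to do, so it is still worth timing both queries against a realistic volume of data.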