Question

I have a table with some data, and I want to select the latest row for each type.

Table:

+----+------+------+---------------------+
| ID | data | type | date                |
+----+------+------+---------------------+
|  1 | just |    2 | 2010-08-07 14:24:48 |
|  2 | some |    2 | 2010-08-07 18:07:32 |
|  3 | data |    9 | 2010-08-06 02:52:17 |
|  4 | abcd |    1 | 2010-08-08 17:23:22 |
|  5 | efg1 |    5 | 2010-07-10 21:36:55 |
|  6 | c123 |    5 | 2010-07-10 20:44:36 |
|  7 | bbey |   12 | 2010-08-09 09:01:26 |
+----+------+------+---------------------+

Currently I am using a simple subquery, and it seems to work fine:

SELECT `data`,`type`,`date`
FROM `table1`
WHERE `date` = (
                  SELECT MAX( `date` )
                  FROM `table1` AS tbl2
                  WHERE tbl2.`type` = `table1`.`type`
                )
GROUP BY `type`
ORDER BY `type`,`date`

Result:

+------+------+---------------------+
| data | type | date                |
+------+------+---------------------+
| abcd |    1 | 2010-08-08 17:23:22 |
| some |    2 | 2010-08-07 18:07:32 |
| efg1 |    5 | 2010-07-10 21:36:55 |
| data |    9 | 2010-08-06 02:52:17 |
| bbey |   12 | 2010-08-09 09:01:26 |
+------+------+---------------------+

My question is whether there is a better way to do this: some optimization or improvement, or perhaps a way to do it with a join?

Answer 1

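You are using a correlated subquery. The subquery depends on the outer query, so it has to be executed once for each row of the outer query.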
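
One way to check this on your own server is to prefix the query with EXPLAIN. The sketch below uses the query from the question, with the GROUP BY and ORDER BY left out to keep the focus on the subquery. For a correlated subquery, MySQL normally reports the inner SELECT with a select_type of DEPENDENT SUBQUERY. The rest of the plan depends on the MySQL version, the indexes, and the data, so no output is shown here.

```sql
-- Ask MySQL how it plans to execute the query from the question.
-- A select_type of DEPENDENT SUBQUERY on the inner SELECT means the
-- subquery is correlated with the outer query.
EXPLAIN
SELECT `data`, `type`, `date`
FROM `table1`
WHERE `date` = (
                  SELECT MAX( `date` )
                  FROM `table1` AS tbl2
                  WHERE tbl2.`type` = `table1`.`type`
                );
```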

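In general, a correlated subquery like this can be improved by using the subquery as a derived table instead. Because the derived table is not correlated with the outer query, it only needs to be computed once, so this solution is considered more scalable: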

SELECT    t1.`data`, t1.`type`, t1.`date`
FROM      `table1` t1
JOIN      (
              SELECT   MAX( `date`) `max_date`, `type`
              FROM     `table1`
              GROUP BY `type`
          ) der_t ON (der_t.`max_date` = t1.`date` AND der_t.`type` = t1.`type`)
GROUP BY  t1.`type`
ORDER BY  t1.`type`, t1.`date`;

Test case:

CREATE TABLE table1 (id int, data varchar(10), type int, date datetime); 

INSERT INTO table1 VALUES (1, 'just', 2, '2010-08-07 14:24:48');
INSERT INTO table1 VALUES (2, 'some', 2, '2010-08-07 18:07:32');
INSERT INTO table1 VALUES (3, 'data', 9, '2010-08-06 02:52:17');
INSERT INTO table1 VALUES (4, 'abcd', 1, '2010-08-08 17:23:22');
INSERT INTO table1 VALUES (5, 'efg1', 5, '2010-07-10 21:36:55');
INSERT INTO table1 VALUES (6, 'c123', 5, '2010-07-10 20:44:36');
INSERT INTO table1 VALUES (7, 'bbey', 12, '2010-08-09 09:01:26');

Result:

+------+------+---------------------+
| data | type | date                |
+------+------+---------------------+
| abcd |    1 | 2010-08-08 17:23:22 |
| some |    2 | 2010-08-07 18:07:32 |
| efg1 |    5 | 2010-07-10 21:36:55 |
| data |    9 | 2010-08-06 02:52:17 |
| bbey |   12 | 2010-08-09 09:01:26 |
+------+------+---------------------+
5 rows in set (0.00 sec)
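
Both the correlated subquery and the derived table look up MAX(`date`) per `type`, so a composite index on (`type`, `date`) is usually worth trying. This is a suggestion that goes beyond the original answer, and the index name `idx_type_date` is hypothetical. With such an index, MySQL may be able to answer the grouped MAX from the index alone (EXPLAIN then shows "Using index for group-by" in the Extra column), but on a table this small the optimizer can choose a different plan.

```sql
-- Hypothetical index name. The column order (type first, then date) lets
-- MAX(`date`) per `type` be read from the index instead of the table rows.
CREATE INDEX idx_type_date ON `table1` (`type`, `date`);

-- Re-check the plan of the grouped subquery after adding the index.
EXPLAIN
SELECT   MAX(`date`) AS `max_date`, `type`
FROM     `table1`
GROUP BY `type`;
```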

It also looks like you can avoid subqueries altogether with a solution such as the following:

SELECT     t1.`data`, t1.`type`, t1.`date`
FROM       `table1` t1
LEFT JOIN  `table1` t2 ON (t1.`date` < t2.`date` AND t1.`type` = t2.`type`)
WHERE      t2.`date` IS NULL
GROUP BY   t1.`type`
ORDER BY   t1.`type`, t1.`date`;

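In general, this performs even better than the solution with the derived table, but if performance is critical, you may want to measure both solutions. The article that @Naktibalda provided also offers a few other solutions that you may want to test.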
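
As one more candidate to benchmark, and not part of the original answer: on MySQL 8.0 or later, window functions offer another way to pick the latest row per type. The sketch below assumes the test table from above. The aliases `rn` and `ranked` and the tie-break on `id` are illustrative choices.

```sql
-- Requires MySQL 8.0 or later (window functions).
SELECT `data`, `type`, `date`
FROM (
    SELECT `data`, `type`, `date`,
           -- Number the rows of each type from newest to oldest;
           -- `id` breaks ties between rows that share a timestamp.
           ROW_NUMBER() OVER (PARTITION BY `type`
                              ORDER BY `date` DESC, `id` DESC) AS rn
    FROM `table1`
) ranked
WHERE rn = 1
ORDER BY `type`;
```

On the test data above this should return the same five rows. Because exactly one row per type gets rn = 1, no GROUP BY is needed. That matters on servers where the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), which rejects the GROUP BY `type` form used in the queries above because `data` and `date` are not aggregated.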

MySQL query, subquery optimization, SELECT, JOIN
