SQL MAX and GROUP BY with WHERE

Time: 2015-01-08 11:27:28

Tags: mysql sql

Given the following table:

CREATE TABLE `test` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `device_id` INT(11) UNSIGNED NOT NULL,
    `distincted` BIT(1) NOT NULL DEFAULT b'0',
    `timestamp_detected` DATETIME NOT NULL,
    PRIMARY KEY (`id`),
    INDEX `idx1` (`device_id`),
    INDEX `idx2` (`device_id`, `timestamp_detected`),
    CONSTRAINT `test_ibfk_1` FOREIGN KEY (`device_id`) REFERENCES `device` (`id`)
)
COLLATE='utf8mb4_general_ci'
ENGINE=InnoDB
ROW_FORMAT=COMPACT;

I want to get the group-wise maximum of timestamp_detected, grouped by device_id, using the following query:

SELECT lh1.id, lh1.timestamp_detected, lh1.device_id FROM test as lh1, 
    (SELECT MAX(timestamp_detected) as max_timestamp_detected, device_id FROM test GROUP BY device_id) as lh2 
    WHERE lh1.timestamp_detected = lh2.max_timestamp_detected 
        AND lh1.device_id = lh2.device_id;

Running EXPLAIN on this query produces the following result:

+----+-------------+------------+-------+---------------------------------------------------------+------------------------------+---------+------------------------------------------+------+--------------------------+
| id | select_type | table      | type  | possible_keys                                           | key                          | key_len | ref                                      | rows | Extra                    |
+----+-------------+------------+-------+---------------------------------------------------------+------------------------------+---------+------------------------------------------+------+--------------------------+
|  1 | PRIMARY     | <derived2> | ALL   | NULL                                                    | NULL                         | NULL    | NULL                                     |   15 | Using where              |
|  1 | PRIMARY     | lh1        | ref   | FK_location_history_device,device_id_timestamp_detected | device_id_timestamp_detected | 9       | lh2.device_id,lh2.max_timestamp_detected |    1 | Using index              |
|  2 | DERIVED     | test       | range | FK_location_history_device,device_id_timestamp_detected | device_id_timestamp_detected | 4       | NULL                                     |   15 | Using index for group-by |
+----+-------------+------------+-------+---------------------------------------------------------+------------------------------+---------+------------------------------------------+------+--------------------------+

Now the requirement is to include only rows with distincted = 1 in the result. I changed the query to the following:

SELECT lh1.id, lh1.timestamp_detected, lh1.device_id FROM test as lh1, 
        (SELECT MAX(timestamp_detected) as max_timestamp_detected, device_id FROM test WHERE distincted = 1 GROUP BY device_id) as lh2 
        WHERE lh1.timestamp_detected = lh2.max_timestamp_detected 
            AND lh1.device_id = lh2.device_id;

It returns the correct result, but it seems to take longer. Running EXPLAIN produces the following:

+----+-------------+------------+-------+---------------------------------------------------------+------------------------------+---------+------------------------------------------+------+-------------+
| id | select_type | table      | type  | possible_keys                                           | key                          | key_len | ref                                      | rows | Extra       |
+----+-------------+------------+-------+---------------------------------------------------------+------------------------------+---------+------------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2> | ALL   | NULL                                                    | NULL                         | NULL    | NULL                                     |  860 | Using where |
|  1 | PRIMARY     | lh1        | ref   | FK_location_history_device,device_id_timestamp_detected | device_id_timestamp_detected | 9       | lh2.device_id,lh2.max_timestamp_detected |    1 | Using index |
|  2 | DERIVED     | test       | index | FK_location_history_device,device_id_timestamp_detected | FK_location_history_device   | 4       | NULL                                     |  860 | Using where |
+----+-------------+------------+-------+---------------------------------------------------------+------------------------------+---------+------------------------------------------+------+-------------+

I tried adding the distincted column to index idx2, but it did not help. How can I optimize this query?

1 answer:

Answer 0 (score: 1)

The query is:

SELECT lh1.id, lh1.timestamp_detected, lh1.device_id
FROM test lh1 JOIN
     (SELECT MAX(timestamp_detected) as max_timestamp_detected, device_id
      FROM test
      WHERE distincted = 1
      GROUP BY device_id
     ) as lh2 
     on lh1.timestamp_detected = lh2.max_timestamp_detected AND
        lh1.device_id = lh2.device_id;

For this query, I would suggest indexes on test(distincted, device_id, timestamp_detected) and test(device_id, timestamp_detected).
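
A minimal sketch of how the first suggested index might be created. The index name idx_distincted_device_ts is hypothetical and not from the original post. The column order follows the suggestion: the equality-filtered column first, then the GROUP BY column, then the column passed to MAX().

```sql
-- Hypothetical index name; column order as suggested in the answer.
ALTER TABLE test
    ADD INDEX idx_distincted_device_ts (distincted, device_id, timestamp_detected);

-- test(device_id, timestamp_detected) already exists in the question's
-- table definition as idx2, so it should not need to be created again.
```

Re-running EXPLAIN on the query afterwards should show whether the derived table picks up the new index.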

I also wonder whether you would get better performance with this equivalent query:

SELECT lh1.id, lh1.timestamp_detected, lh1.device_id
FROM test lh1
WHERE distincted = 1 AND
      NOT EXISTS (SELECT 1
                  FROM test t
                  WHERE t.distincted = 1 AND
                        t.device_id = lh1.device_id AND
                        t.timestamp_detected > lh1.timestamp_detected
                 );

This one would use two indexes: test(distincted) and test(device_id, timestamp_detected, distincted).
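
As a sketch only, those two indexes could be added in a single statement. Both index names below are hypothetical and chosen for illustration.

```sql
-- Hypothetical index names; both indexes are added in one ALTER TABLE.
ALTER TABLE test
    ADD INDEX idx_distincted (distincted),
    ADD INDEX idx_device_ts_distincted (device_id, timestamp_detected, distincted);
```

One caveat when comparing the two forms: the NOT EXISTS version filters the outer rows on distincted = 1, while the JOIN version does not filter lh1. The results may therefore differ if a device has a row with distincted = 0 at exactly the same timestamp_detected as its latest distincted = 1 row.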