From a child table

Time: 2017-07-20 22:41:37

Tags: mysql performance query-optimization greatest-n-per-group

I have a table with millions of records in a one-to-many relationship: one object to many object_items.

object_items:

CREATE TABLE `object_items` (
  `item_name` varchar(50) NOT NULL DEFAULT '',
  `object_id` int(10) unsigned NOT NULL DEFAULT '0',
  `sequence` int(10) unsigned NOT NULL,
  `completed` tinyint(1) NOT NULL DEFAULT '0',
  `is_active` tinyint(1) NOT NULL DEFAULT '0',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uni_seq_object_id` (`sequence`,`object_id`),
  KEY `idx_object_id` (`object_id`),
  KEY `idx_seq` (`sequence`)
) ENGINE=InnoDB AUTO_INCREMENT=3408237 DEFAULT CHARSET=utf8mb4  

Sample data:

+-----------+-----------+----------+-----------+------+
| item_name | object_id | sequence | completed | id   |
+-----------+-----------+----------+-----------+------+
| ABCD      |        10 |        1 |         1 |    1 |
| BCDE      |        10 |        2 |         1 |    2 |
| CDEF      |        10 |        3 |         1 |    3 |
| DEFG      |        10 |        4 |         0 |    4 |
| ABCD      |        11 |        1 |         1 |    5 |
| BCDE      |        11 |        2 |         1 |    6 |
| CDEF      |        11 |        3 |         0 |    7 |
| DEFG      |        11 |        4 |         0 |    8 |
| ABCD      |        12 |        1 |         1 |    9 |
| BCDE      |        12 |        2 |         1 |   10 |
+-----------+-----------+----------+-----------+------+

Expected result:

+-----------+-----------+----------+-----------+------+
| item_name | object_id | sequence | completed | id   |
+-----------+-----------+----------+-----------+------+
| DEFG      |        10 |        4 |         0 |    4 |
| CDEF      |        11 |        3 |         0 |    7 |
+-----------+-----------+----------+-----------+------+

The query I am running:

select
  a.*
from object_items a
where a.sequence = (
  select min(sequence)
  from object_items b
  where a.object_id = b.object_id
    and b.completed = 0
)

This works when I add a LIMIT, but if I run COUNT(*) over it, the query dies.

EXPLAIN output for the query:

+----+--------------------+-------+------------+------+---------------+---------------+---------+-----------------+---------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys | key           | key_len | ref             | rows    | filtered | Extra       |
+----+--------------------+-------+------------+------+---------------+---------------+---------+-----------------+---------+----------+-------------+
|  1 | PRIMARY            | a     | NULL       | ALL  | NULL          | NULL          | NULL    | NULL            | 3268598 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | b     | NULL       | ref  | idx_object_id | idx_object_id | 4       | db.a.object_id  |      21 |    10.00 | Using where |
+----+--------------------+-------+------------+------+---------------+---------------+---------+-----------------+---------+----------+-------------+

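Is there a better way, on a database this large, to get the next not-yet-completed TODO item in sequence order, only for those objects that still have at least one item left to complete?

Thanks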

1 answer:

Answer 0 (score: 1)

SELECT a.*
FROM (
    SELECT object_id, MIN(sequence) AS sequence
    FROM object_items b
    WHERE b.completed = 0
    GROUP BY object_id
) AS m
INNER JOIN object_items a
  USING (object_id, sequence)
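
If only the row count is needed (the COUNT(*) case from the question), the join can probably be skipped altogether. The following is a sketch, not a measured result. It assumes the existing unique key on (sequence, object_id) holds, so each object with an incomplete item contributes exactly one row to the result above, and the count equals the number of distinct objects that have at least one incomplete item. The alias `pending_objects` is a name made up for illustration.

```sql
-- Count objects that still have at least one incomplete item.
-- Relies on UNIQUE (sequence, object_id): one "next item" row per object.
SELECT COUNT(DISTINCT object_id) AS pending_objects
FROM object_items
WHERE completed = 0;
```

On the sample data this counts objects 10 and 11, which matches the two rows in the expected result.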

Add an index on the columns (completed, object_id, sequence) to optimize the subquery.

Add an index on the columns (object_id, sequence) to optimize the outer query.
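
The two indexes suggested above could be created along the following lines. This is a minimal sketch: the index names `idx_completed_object_seq` and `idx_object_seq` are hypothetical, and on a table with millions of rows the ALTER may take a while, so it is worth trying on a copy first. The EXPLAIN afterwards is only there to check whether the optimizer actually picks up the new indexes.

```sql
-- Index names are hypothetical; pick names that fit your schema.
ALTER TABLE object_items
  ADD INDEX idx_completed_object_seq (completed, object_id, sequence),
  ADD INDEX idx_object_seq (object_id, sequence);

-- Re-check the plan once the indexes exist.
EXPLAIN
SELECT a.*
FROM (
    SELECT object_id, MIN(sequence) AS sequence
    FROM object_items b
    WHERE b.completed = 0
    GROUP BY object_id
) AS m
INNER JOIN object_items a
  USING (object_id, sequence);
```

Note that the existing unique key `uni_seq_object_id` already covers the same two columns in the opposite order, so the optimizer may use it for the join instead; the EXPLAIN output will show which index it chooses. With (object_id, sequence) in place, the single-column `idx_object_id` becomes a left prefix of it and is likely redundant.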