How to choose indexes

Time: 2015-05-11 17:30:43

Tags: sql indexing mariadb

I am trying to run the following query many times:

SELECT st2.stop_id AS to_stop_id,
       TIME_TO_SEC(
       ADDTIME(TIMEDIFF(MIN(st1.time), %time),
       TIMEDIFF(st2.time, st2.time))) AS duration
FROM   stop_times st1,
       stop_times st2,
       trips tr,
       calendar cal
WHERE  tr.service_id   = cal.service_id
  AND  tr.trip_id      = st1.trip_id
  AND  st1.trip_id     = st2.trip_id
  AND  st1.stop_id     = %sid
  AND  st1.stop_seq +1 = st2.stop_seq
  AND  st1.time        > %time
  AND  DATE(NOW()) BETWEEN cal.start_date AND
  cal.end_date
GROUP BY st2.stop_id

However, it runs very slowly. I have indexed the following columns:

+------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| stop_times |          0 | st_id      |            1 | st_id       | A         |    11431583 |     NULL | NULL   |      | BTREE      |         |               |
| stop_times |          1 | fk_tid_s   |            1 | trip_id     | A         |     1039234 |     NULL | NULL   | YES  | BTREE      |         |               |
| stop_times |          1 | st_per_sid |            1 | stop_id     | A         |       33135 |     NULL | NULL   | YES  | BTREE      |         |               |
| calendar   |          0 | PRIMARY    |            1 | service_id  | A         |        5206 |     NULL | NULL   |      | BTREE      |         |               |
| trips      |          0 | PRIMARY    |            1 | trip_id     | A         |      449489 |     NULL | NULL   |      | BTREE      |         |               |
| trips      |          1 | fk_rid     |            1 | route_id    | A         |        1937 |     NULL | NULL   | YES  | BTREE      |         |               |
| trips      |          1 | fk_sid     |            1 | service_id  | A         |        7749 |     NULL | NULL   | YES  | BTREE      |         |               |
+------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

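(For some reason st_id is not shown as the PRIMARY KEY, but it is. I don't know whether that matters, but just in case...)

I ran SQL EXPLAIN on this query, and it gave me the following output: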

+------+-------------+-------+--------+-------------------------------------------------+---------------------+---------+------------------------------+------+---------------------------------------------------------------------+
| id   | select_type | table | type   | possible_keys                                   | key                 | key_len | ref                          | rows | Extra                                                               |
+------+-------------+-------+--------+-------------------------------------------------+---------------------+---------+------------------------------+------+---------------------------------------------------------------------+
|    1 | SIMPLE      | st1   | range  | comp_uniq_st_seq,st_per_sid,comp_uniq_stid_time | comp_uniq_stid_time | 9       | NULL                         | 1396 | Using index condition; Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | tr    | eq_ref | PRIMARY,fk_sid                                  | PRIMARY             | 8       | reseau_ratp.st1.trip_id      |    1 | Using where                                                         |
|    1 | SIMPLE      | cal   | eq_ref | PRIMARY,comp_sid_date_en,comp_sid_date_st       | PRIMARY             | 4       | reseau_ratp.tr.service_id    |    1 | Using where                                                         |
|    1 | SIMPLE      | st2   | ref    | comp_uniq_st_seq                                | comp_uniq_st_seq    | 14      | reseau_ratp.st1.trip_id,func |    1 | Using index condition                                               |
+------+-------------+-------+--------+-------------------------------------------------+---------------------+---------+------------------------------+------+---------------------------------------------------------------------+

What can I do to make this query run faster?

Edit: the query rewritten with the requested syntax:

SELECT st2.stop_id AS to_stop_id,
       TIME_TO_SEC(
       ADDTIME(TIMEDIFF(MIN(st1.time), %time),
       TIMEDIFF(st2.time, st2.time))) AS duration

FROM   stop_times st1
  INNER JOIN stop_times st2
          ON st1.trip_id = st2.trip_id AND st1.stop_seq + 1 = st2.stop_seq
  INNER JOIN trips tr
          ON tr.trip_id = st1.trip_id
  INNER JOIN calendar cal
          ON tr.service_id = cal.service_id

WHERE  st1.stop_id     =  %sid
  AND  st1.time        >  %time
  AND  cal.start_date  <= NOW()
  AND  cal.end_date    >= NOW()

GROUP BY st2.stop_id

Here is SHOW CREATE TABLE for stop_times:

CREATE TABLE `stop_times` (
  `trip_id` bigint(10) unsigned DEFAULT NULL,
  `stop_id` int(10) DEFAULT NULL,
  `time` time DEFAULT NULL,
  `stop_seq` int(10) unsigned DEFAULT NULL,
  UNIQUE KEY `comp_uniq_st_seq` (`trip_id`,`stop_seq`),
  KEY `comp_uniq_stid_time` (`stop_id`,`time`),
  CONSTRAINT `fk_sid_s` FOREIGN KEY (`stop_id`) REFERENCES `stops` (`stop_id`),
  CONSTRAINT `fk_tid_s` FOREIGN KEY (`trip_id`) REFERENCES `trips` (`trip_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

For calendar:

CREATE TABLE `calendar` (
  `service_id` int(10) unsigned NOT NULL,
  `start_date` date DEFAULT NULL,
  `end_date` date DEFAULT NULL,
  PRIMARY KEY (`service_id`),
  KEY `comp_sid_date_en` (`service_id`,`end_date`),
  KEY `comp_sid_date_st` (`service_id`,`start_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

For trips:

CREATE TABLE `trips` (
  `trip_id` bigint(10) unsigned NOT NULL DEFAULT '0',
  `route_id` int(10) unsigned DEFAULT NULL,
  `service_id` int(10) unsigned DEFAULT NULL,
  `trip_headsign` varchar(15) DEFAULT NULL,
  `trip_short_name` varchar(15) DEFAULT NULL,
  `direction_id` tinyint(1) DEFAULT NULL,
  PRIMARY KEY (`trip_id`),
  KEY `fk_rid` (`route_id`),
  KEY `fk_sid` (`service_id`),
  CONSTRAINT `fk_rid` FOREIGN KEY (`route_id`) REFERENCES `routes` (`route_id`),
  CONSTRAINT `fk_sid` FOREIGN KEY (`service_id`) REFERENCES `calendar` (`service_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

1 answer:

Answer 0 (score: 1)

st1 needs this composite index: INDEX(stop_id, time)
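As a rough sketch, such an index could be added as shown below. The index name idx_stop_time is made up for illustration. Note that the CREATE TABLE posted in the question already has an equivalent index, comp_uniq_stid_time, so this only applies if that index is missing.

```sql
-- Hypothetical index name; skip this if comp_uniq_stid_time (stop_id, time) already exists.
-- stop_id is matched with "=" and time with ">", so the equality column goes first.
ALTER TABLE stop_times
  ADD INDEX idx_stop_time (stop_id, `time`);
```

Running EXPLAIN on the query again should then show the index in the key column for st1.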

Please use the JOIN ... ON syntax.

Please provide SHOW CREATE TABLE.

Here is a Cookbook on creating INDEXes from a SELECT.

(Edit)

Calendar is tricky to handle; there is no "good" index for it. These may help:

INDEX(service_id, start_date)
INDEX(service_id, end_date)

In addition, change AND DATE(NOW()) BETWEEN cal.start_date AND cal.end_date to

AND cal.start_date <= NOW()
AND cal.end_date   >= NOW()
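One caveat, offered as a sketch rather than as part of the original suggestion: start_date and end_date are DATE columns, and when a DATE is compared with the DATETIME returned by NOW(), the DATE is treated as midnight of that day. A service whose end_date is today would therefore stop matching once midnight has passed, whereas the original BETWEEN with DATE(NOW()) still included it. Assuming the original behaviour is what is wanted, comparing against CURDATE() keeps it:

```sql
-- Fragment for the WHERE clause: same rows as
-- DATE(NOW()) BETWEEN cal.start_date AND cal.end_date,
-- but compares DATE to DATE, so a service ending today still matches.
AND cal.start_date <= CURDATE()
AND cal.end_date   >= CURDATE()
```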

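(Edit 2)

Say NOT NULL wherever practical. This is especially important here because stop_times has no PRIMARY KEY. Change the two columns in UNIQUE KEY comp_uniq_st_seq (trip_id, stop_seq) to NOT NULL and turn it into PRIMARY KEY (trip_id, stop_seq). That lets the performance benefit of the PK being "clustered" with the data kick in.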
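A minimal sketch of that change, assuming the table matches the SHOW CREATE TABLE posted in the question (no existing primary key) and that neither column currently holds NULLs. On a table with millions of rows this rebuilds the table, so it is worth trying on a copy first.

```sql
-- Check first: this should return 0, otherwise the NOT NULL change
-- will fail or alter data, depending on the SQL mode.
SELECT COUNT(*) AS rows_with_nulls
FROM   stop_times
WHERE  trip_id IS NULL OR stop_seq IS NULL;

-- Make both columns NOT NULL and promote the unique key to the primary key
-- in a single ALTER, so the table is rebuilt only once.
ALTER TABLE stop_times
  MODIFY trip_id  BIGINT(10) UNSIGNED NOT NULL,
  MODIFY stop_seq INT(10) UNSIGNED NOT NULL,
  DROP INDEX comp_uniq_st_seq,
  ADD PRIMARY KEY (trip_id, stop_seq);
```

With InnoDB, the rows are then stored in (trip_id, stop_seq) order, so the lookup of st2 by trip and next stop_seq reads the row directly from the primary key instead of going through a secondary index.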

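Now that I see the CREATE TABLE for calendar, and that service_id is its PRIMARY KEY, the two indexes I suggested are probably useless. (Again, this comes down to "clustering".)

My Cookbook for building indexes may come in handy.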