Optimizing the ORDER BY clause in a join query

Time: 2013-10-05 17:18:24

Tags: mysql sql database database-design

I need help optimizing this query.

  SELECT messages.*
   FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

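Without the LIMIT, this query returns 200K rows and takes about 1.3 to 2 seconds to run. The problem seems to be the ORDER BY clause: without it, the query takes .0005 seconds.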

Indexes:
    ( subscription.user_id, subscription.entity_id )
    ( subscription.entity_id )
    ( messages.timestamp )
    ( messages.entity_id, messages.timestamp )

I can improve performance by changing the query to:

   SELECT messages.* FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
   INNER JOIN (
      SELECT message_id FROM messages ORDER BY timestamp DESC
   ) AS temp ON temp.message_id = messages.message_id
   WHERE subscription.user_id = 1 LIMIT 50

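This runs in .12 seconds. A very nice improvement, but I would like to know whether it can be better still. It seems that if I could somehow filter the second inner join, things would be faster.

Thanks.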

SCHEMA:

   messages 
      message_id, entity_id, message, timestamp

   subscription
      user_id, entity_id

UPDATE

Raymond Nijland's answer solved my original problem, but another problem has just come up.

 SELECT messages.*
   FROM messages
   STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

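STRAIGHT_JOIN is inefficient in two cases:

  1. There is no entry for the user_id in the subscription table

  2. There are few related entries in the messages table

Any suggestions on how to solve this? If not from the query side, what about from the application side?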

UPDATE

EXPLAIN INFO

LIMIT 50

    | id | select_type | table             | type   | possible_keys                           | key           | key_len | ref                                    | rows | Extra       |
    |  1 | SIMPLE      | messages          | index  | idx_timestamp                           | idx_timestamp | 4       | NULL                                   |   50 |             |
    |  1 | SIMPLE      | subscription      | eq_ref | PRIMARY,entity_id,user_id               | PRIMARY       | 16      | const, messages.entity_id              |    1 | Using index |
    

No LIMIT

    | id | select_type | table             | type   | possible_keys                           | key           | key_len | ref                                    |   rows   | Extra         |
    |  1 | SIMPLE      | messages          | ALL    | entity_id_2,entity_id                   | NULL          | NULL    | NULL                                   |   255069 | Using filesort|
    |  1 | SIMPLE      | subscription      | eq_ref | PRIMARY,entity_id,user_id               | PRIMARY       | 16      | const, messages.entity_id              |        1 | Using index   |
    

CREATE TABLE statements:

~5000 rows

    subscription | CREATE TABLE `subscription` (
      `user_id`   bigint(20) unsigned NOT NULL,
      `entity_id` bigint(20) unsigned NOT NULL,
      PRIMARY KEY (`user_id`,`entity_id`),
      KEY `entity_id` (`entity_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    

~255,000 rows

    messages | CREATE TABLE `messages` (
      `message_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
      `entity_id` bigint(20) unsigned NOT NULL,
      `message` varchar(255) NOT NULL DEFAULT '',
      `timestamp` int(10) unsigned NOT NULL,
      PRIMARY KEY (`message_id`),
      KEY `entity_id` (`entity_id`,`timestamp`),
      KEY `idx_timestamp` (`timestamp`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 
    

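1 Answer:

Answer 0: (score: 3)

Drop the index on messages.entity_id; it is redundant, because the composite index (entity_id, timestamp) already covers lookups by entity_id. Then try a STRAIGHT_JOIN: I think the MySQL optimizer is accessing your tables in the wrong order. MySQL needs to access the messages table first so that it can use the messages index (entity_id, timestamp) and avoid the need for "Using temporary; Using filesort" (which can be slow if MySQL has to create a disk-based MyISAM temporary table and sort it (quicksort algorithm), with the disk I/O reads and writes that involves).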

 SELECT STRAIGHT_JOIN messages.*
   FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

OR

 SELECT messages.*
   FROM messages
   STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

I ran into this problem too and fixed it as shown in http://sqlfiddle.com/#!2/b34870/1, but with country/city tables.

Edit, in response to Jason M's comments about STRAIGHT_JOIN

STRAIGHT_JOIN is inefficient in two cases:

There is no entry for the user_id in the subscription table

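Indeed. With INNER JOIN, the MySQL optimizer triggers "Impossible WHERE noticed after reading const tables" and never executes the query. STRAIGHT_JOIN does not trigger "Impossible WHERE noticed after reading const tables", so MySQL has to do a (possibly full) index scan to look for the user_id value, which can slow down query execution. The simple fix: only use STRAIGHT_JOIN with a user_id that exists.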
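
One way to apply this fix is to check for a subscription row before running the STRAIGHT_JOIN query. The following is a minimal sketch that assumes the `subscription` and `messages` tables from the question. The two-step flow, with the application deciding whether to run the second statement, is an assumption for illustration and has not been timed against the asker's data; `has_subscription` is a hypothetical column alias.

```sql
-- Step 1: cheap existence check. The PRIMARY KEY (user_id, entity_id)
-- on subscription lets MySQL answer this from the index alone.
SELECT EXISTS (
    SELECT 1
    FROM subscription
    WHERE user_id = 1
) AS has_subscription;

-- Step 2: run this only when has_subscription = 1
-- (the application makes that decision; otherwise return an empty result).
SELECT messages.*
FROM messages
STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC
LIMIT 50;
```

When the check returns 0, the application can skip the second statement entirely, which sidesteps the index scan described above.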

There are few related entries in the messages table

The same problem may occur here: MySQL thinks it should do a (possibly full) index scan to find the results. But I would need to see the EXPLAIN output to be sure.
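
The plan can be obtained by prefixing the query with EXPLAIN. This is a sketch using the STRAIGHT_JOIN query from earlier in this answer; what the output looks like will depend on the actual data.

```sql
-- Show the execution plan instead of running the query.
EXPLAIN
SELECT messages.*
FROM messages
STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC
LIMIT 50;
```

The columns worth checking are `type`, `key`, `rows`, and `Extra`. An `index` or `ALL` access type on `messages` with a large `rows` estimate, or "Using filesort" in `Extra`, would point to the scan described above.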

You may also want to try this query first:

SELECT 
 *
FROM (

 SELECT
  entity_id

 FROM
  subscription

 WHERE
  subscription.user_id = 1 
)
 subscriptions

INNER JOIN 
 messages

ON
 subscriptions.entity_id = messages.entity_id

ORDER BY
 messages.timestamp DESC

LIMIT 50