SQL query optimization when using multiple joins and large record sets

Time: 2012-03-14 02:04:27

Tags: mysql sql performance optimization database-optimization

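I am building a message board and am trying to retrieve the regular topics (i.e., topics that are not stickied), sorted by the date of the last posted message. I can do this, but with about 10,000 messages and 1,500 topics the query takes more than 60 seconds.

My question: is there anything I can change in my query to improve performance, or is my design fundamentally flawed?

Here is the query I am using.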

SELECT Messages.topic_id,
       Messages.posted,
       Topics.title,
       Topics.user_id,
       Users.username
FROM Messages
LEFT JOIN
  Topics USING(topic_id)
LEFT JOIN
   Users on Users.user_id = Topics.user_id
WHERE Messages.message_id IN (
    SELECT MAX(message_id)
    FROM Messages
    GROUP BY topic_id)
AND Messages.topic_id
NOT IN (
    SELECT topic_id
    FROM StickiedTopics)
AND Messages.posted IN (                           
    SELECT MIN(posted)
    FROM Messages 
    GROUP BY message_id)
AND Topics.board_id=1
ORDER BY Messages.posted DESC LIMIT 50

Edit: here is the EXPLAIN plan.

+----+--------------------+----------------+----------------+------------------+----------+---------+-------------------------+------+----------------------------------------------+
| id | select_type        | table          | type           | possible_keys    | key      | key_len | ref                     | rows | Extra                                        |
+----+--------------------+----------------+----------------+------------------+----------+---------+-------------------------+------+----------------------------------------------+
|  1 | PRIMARY            | Topics         | ref            | PRIMARY,board_id | board_id | 4       | const                   |  641 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | Users          | eq_ref         | PRIMARY          | PRIMARY  | 4       | spergs3.Topics.user_id  |    1 |                                              |
|  1 | PRIMARY            | Messages       | ref            | topic_id         | topic_id | 4       | spergs3.Topics.topic_id |    3 | Using where                                  |
|  4 | DEPENDENT SUBQUERY | Messages       | index          | NULL             | PRIMARY  | 8       | NULL                    |    1 |                                              |
|  3 | DEPENDENT SUBQUERY | StickiedTopics | index_subquery | topic_id         | topic_id | 4       | func                    |    1 | Using index                                  |
|  2 | DEPENDENT SUBQUERY | Messages       | index          | NULL             | topic_id | 4       | NULL                    |    3 | Using index                                  |
+----+--------------------+----------------+----------------+------------------+----------+---------+-------------------------+------+----------------------------------------------+

Indexes

+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table    | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Messages |          0 | PRIMARY  |            1 | message_id  | A         |        9956 |     NULL | NULL   |      | BTREE      |         |
| Messages |          0 | PRIMARY  |            2 | revision_no | A         |        9956 |     NULL | NULL   |      | BTREE      |         |
| Messages |          1 | user_id  |            1 | user_id     | A         |         432 |     NULL | NULL   |      | BTREE      |         |
| Messages |          1 | topic_id |            1 | topic_id    | A         |        3318 |     NULL | NULL   |      | BTREE      |         |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Topics |          0 | PRIMARY  |            1 | topic_id    | A         |        1205 |     NULL | NULL   |      | BTREE      |         |
| Topics |          1 | user_id  |            1 | user_id     | A         |         133 |     NULL | NULL   |      | BTREE      |         |
| Topics |          1 | board_id |            1 | board_id    | A         |           1 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Users |          0 | PRIMARY         |            1 | user_id     | A         |        2051 |     NULL | NULL   |      | BTREE      |         |
| Users |          0 | username_UNIQUE |            1 | username    | A         |        2051 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

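2 answers:

Answer 0 (score: 2)

I would start from the base set of qualifying topics, get those IDs, and only then join out to the other tables. My inner query pre-qualifies the rows by grouping on topic_id and taking the max message_id, which yields one ID per qualifying topic. I have also applied a LEFT JOIN to StickiedTopics. Why? A left join lets me detect the topics that ARE found there (the ones you want to exclude), so I added a WHERE clause requiring the StickiedTopics topic ID to be NULL (i.e., NOT found). This already pares the list down significantly without several nested subqueries. From that result we can join to Messages, Topics (including the board_id = 1 qualifier), and Users, and pick up the columns as needed. Finally, a single WHERE ... IN sub-select applies your MIN(posted) qualifier. I don't understand the reason for it, but I left it in because it was part of your original query. Then come the ORDER BY and the LIMIT.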

SELECT STRAIGHT_JOIN
      M.topic_id,
      M.posted,
      T.title,
      T.user_id,
      U.username
   FROM 
      ( select 
              M1.Topic_ID, 
              MAX( M1.Message_id ) MaxMsgPerTopic
           from 
              Messages M1
                 LEFT Join StickiedTopics ST
                    ON M1.Topic_ID = ST.Topic_ID
           where
              ST.Topic_ID IS NULL
           group by 
              M1.Topic_ID ) PreQuery
        JOIN Messages M
           ON PreQuery.MaxMsgPerTopic = M.Message_ID
           JOIN Topics T
               ON M.Topic_ID = T.Topic_ID
              AND T.Board_ID = 1
              LEFT JOIN Users U
                 on T.User_ID = U.user_id 
   WHERE
      M.posted IN ( SELECT MIN(posted)
                       FROM Messages 
                       GROUP BY message_id)
   ORDER BY 
      M.posted DESC 
   LIMIT 50
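
To see what the derived table contributes on its own, the pre-query can be run under EXPLAIN in isolation. This is only a sketch: it reuses the table and column names from the question and assumes nothing beyond the indexes listed there.

```sql
-- Inspect the plan for the inner pre-query by itself.
EXPLAIN
SELECT M1.topic_id,
       MAX(M1.message_id) AS MaxMsgPerTopic
FROM Messages M1
LEFT JOIN StickiedTopics ST
       ON M1.topic_id = ST.topic_id
WHERE ST.topic_id IS NULL        -- keep only topics with no StickiedTopics row
GROUP BY M1.topic_id;
```

Because this statement contains no subqueries, its plan should show no DEPENDENT SUBQUERY rows, unlike the plan posted in the question.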

Answer 1 (score: 1)

I would guess that a large part of your problem lies in your subqueries. Try something like this:

SELECT Messages.topic_id,
       Messages.posted,
       Topics.title,
       Topics.user_id,
       Users.username
FROM Messages
LEFT JOIN
    Topics USING(topic_id)
LEFT JOIN
    StickiedTopics ON StickiedTopics.topic_id = Topics.topic_id
LEFT JOIN
    Users on Users.user_id = Topics.user_id
WHERE StickiedTopics.topic_id IS NULL  -- anti-join: keep only topics that are not stickied
AND Messages.message_id IN (
    SELECT MAX(message_id)
    FROM Messages m1
    WHERE m1.topic_id = Messages.topic_id)
AND Messages.posted IN (                           
    SELECT MIN(posted)                                                                                           
    FROM Messages m2
    GROUP BY message_id)
AND Topics.board_id=1
ORDER BY Messages.posted DESC LIMIT 50

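I optimized the first subquery by removing the GROUP BY (it is now correlated on topic_id instead). The second subquery is unnecessary, because it can be replaced by a JOIN.

I am not quite sure what the third subquery is supposed to do: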
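
The JOIN-based replacement for NOT IN is commonly called an anti-join. Here is a minimal sketch of the pattern on its own, using the question's table names. Note that the IS NULL test belongs in the WHERE clause: a condition in the ON clause of a LEFT JOIN only decides which rows match and never removes rows from the left-hand table.

```sql
-- Topics that have no row in StickiedTopics (anti-join form of NOT IN).
SELECT Topics.topic_id
FROM Topics
LEFT JOIN StickiedTopics
       ON StickiedTopics.topic_id = Topics.topic_id
WHERE StickiedTopics.topic_id IS NULL;  -- unmatched left rows carry NULL here
```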

AND Messages.posted IN (                           
    SELECT MIN(posted)                                                                                           
    FROM Messages m2
    GROUP BY message_id)

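If I knew what it was supposed to do, I could probably help optimize it. What exactly is posted: a date, an integer, something else? What does it represent?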