Question

This query is very simple: all I want is to get all articles in a given category, ordered by the last_updated field:

SELECT
    `articles`.*
FROM
    `articles`,
    `articles_to_categories`
WHERE
        `articles`.`id` = `articles_to_categories`.`article_id`
        AND `articles_to_categories`.`category_id` = 1
ORDER BY `articles`.`last_updated` DESC
LIMIT 0, 20;

But it runs very slowly. Here is what EXPLAIN says:

select_type  table                   type     possible_keys           key         key_len  ref                                rows  Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SIMPLE       articles_to_categories  ref      article_id,category_id  article_id  5        const                              5016  Using where; Using temporary; Using filesort
SIMPLE       articles                eq_ref   PRIMARY                 PRIMARY     4        articles_to_categories.article_id  1

Is there a way to rewrite this query, or to add extra logic to my PHP script, so that it avoids Using temporary; Using filesort and runs faster?

Table structure:

*articles*
id | title | content | last_updated

*articles_to_categories*
article_id | category_id

Update

I have indexed last_updated. I think my situation is explained in the documentation:

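In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:

The key used to fetch the rows is not the same as the one used in the ORDER BY: SELECT * FROM t1 WHERE key2 = constant ORDER BY key1;

You are joining many tables, and the columns in the ORDER BY are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in the EXPLAIN output that does not have a const join type.)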

But I still don't know how to fix this.

Answer 1

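Here is a simplified example I put together a while ago for a similar performance-related question. It takes advantage of InnoDB clustered primary key indexes (obviously only available with InnoDB!!)

You have 3 tables: category, product and product_category, as follows: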

drop table if exists product;
create table product
(
prod_id int unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb; 

drop table if exists category;
create table category
(
cat_id mediumint unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb; 

drop table if exists product_category;
create table product_category
(
cat_id mediumint unsigned not null,
prod_id int unsigned not null,
primary key (cat_id, prod_id) -- **note the clustered composite index** !!
)
engine = innodb;

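The most important thing is the column order of the product_category clustered composite primary key, because typical queries for this scenario always lead with cat_id = x or cat_id in (x, y, z...).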
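As a rough sketch of what "leading with cat_id" means for the (cat_id, prod_id) key defined above: both lookups below filter on the leftmost key column, so InnoDB can read one contiguous range of the clustered index per category. The specific cat_id values are made up for illustration.

```sql
-- Single category: one range scan over the clustered PK (cat_id, prod_id).
select prod_id
from product_category
where cat_id = 4104;

-- Several categories: one range per listed cat_id (values are hypothetical).
select cat_id, prod_id
from product_category
where cat_id in (4104, 4105, 4106);
```

A query that filters on prod_id alone cannot use the leftmost column of this key, which is why the opposite ordering would suit article-first or product-first lookups instead.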

We have 500K categories, 1 million products and 125 million product categories.

select count(*) from category;
+----------+
| count(*) |
+----------+
|   500000 |
+----------+

select count(*) from product;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+

select count(*) from product_category;
+-----------+
| count(*)  |
+-----------+
| 125611877 |
+-----------+

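So let's see how this schema performs for a query similar to yours. All queries are run cold (after a MySQL restart), with empty buffers and no query caching.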

select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sorry, no date field in this sample table; it won't make any difference though
limit 20;

+---------+----------------+
| prod_id | name           |
+---------+----------------+
|  993561 | Product 993561 |
|  991215 | Product 991215 |
|  989222 | Product 989222 |
|  986589 | Product 986589 |
|  983593 | Product 983593 |
|  982507 | Product 982507 |
|  981505 | Product 981505 |
|  981320 | Product 981320 |
|  978576 | Product 978576 |
|  973428 | Product 973428 |
|  959384 | Product 959384 |
|  954829 | Product 954829 |
|  953369 | Product 953369 |
|  951891 | Product 951891 |
|  949413 | Product 949413 |
|  947855 | Product 947855 |
|  947080 | Product 947080 |
|  945115 | Product 945115 |
|  943833 | Product 943833 |
|  942309 | Product 942309 |
+---------+----------------+
20 rows in set (0.70 sec) 

explain
select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sorry, no date field in this sample table; it won't make any difference though
limit 20;

+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref              | rows | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pc    | ref    | PRIMARY       | PRIMARY | 3       | const            |  499 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 4       | vl_db.pc.prod_id |    1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

That's 0.70 seconds cold - ouch.
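The EXPLAIN above still reports Using temporary; Using filesort because the sort column comes from p, which is not the first table read. A variation worth trying, offered as a sketch rather than a measured result, is to sort on the same value taken from pc: with cat_id fixed to a constant, the rows in the (cat_id, prod_id) clustered key are already ordered by prod_id.

```sql
select
 p.*
from
 product_category pc
inner join product p on p.prod_id = pc.prod_id
where
 pc.cat_id = 4104
order by
 pc.prod_id desc -- same value as p.prod_id, but taken from the first table read
limit 20;
```

Whether the optimizer actually drops the temporary table and filesort depends on the MySQL version, so confirm with EXPLAIN. Note that this only works when the sort column lives in the table that is read first, which is not the case for last_updated in the question.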

Hope this helps :)

EDIT

Having just read your reply to my comment above, it seems you have one of two choices to make:

create table articles_to_categories
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(article_id, category_id), -- good for queries that lead with article_id = x
key (category_id)
)
engine=innodb;

or

create table categories_to_articles
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(category_id, article_id), -- good for queries that lead with category_id = x
key (article_id)
)
engine=innodb;

Which one you pick depends on your typical queries, since they determine how you should define your clustered PK.
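If the existing articles_to_categories table is to be kept rather than recreated, the category-first layout could be approximated along these lines. This is a sketch under assumptions the question does not confirm: the table is InnoDB, it has no primary key yet, both columns are already NOT NULL, and no (category_id, article_id) pair occurs twice.

```sql
-- Assumptions: InnoDB table, no existing primary key, both columns NOT NULL,
-- and every (category_id, article_id) pair is unique.
alter table articles_to_categories
    add primary key (category_id, article_id);
```

Changing the primary key of an InnoDB table rebuilds the table, so on a large table this is best tried on a copy first. The existing article_id index (visible in the question's EXPLAIN output) would remain available for lookups by article.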

Answer 2

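You should be able to avoid the filesort by adding a key on articles.last_updated. MySQL needs the filesort for the ORDER BY operation, but it can do without one as long as you order by an indexed column (with some restrictions).

For more details, see here: http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html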
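A minimal way to check this on the single table, assuming the index does not exist yet and using idx_last_updated as a made-up index name:

```sql
-- idx_last_updated is a hypothetical name; skip this step if the index already exists.
alter table articles add index idx_last_updated (last_updated);

-- Inspect the Extra column of the plan for "Using filesort".
explain
select *
from articles
order by last_updated desc
limit 20;
```

This only covers the single-table case. Once the join to articles_to_categories is added, the restrictions mentioned above apply, in particular the one about the ORDER BY columns coming from the first table read.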

Answer 3

I assume you have done the following in your db:

1) articles -> id is the primary key

2) articles_to_categories -> article_id is a foreign key to articles -> id

3) You can create an index on category_id

Answer 4

ALTER TABLE articles ADD INDEX (last_updated);
ALTER TABLE articles_to_categories ADD INDEX (article_id);

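That should do it. The correct plan is to find the first few records using the first index, then do the JOIN using the second index. If it doesn't work, try STRAIGHT_JOIN or something else to force the proper index usage.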
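One possible shape for the forced plan, as a sketch: SELECT STRAIGHT_JOIN makes MySQL join the tables in the order they are listed, and FORCE INDEX asks it to use the named index on articles. The index name last_updated is an assumption; it is the name MySQL assigns by default when an index is created without one, as in the ALTER TABLE above.

```sql
SELECT STRAIGHT_JOIN
    `articles`.*
FROM
    `articles` FORCE INDEX (`last_updated`) -- index name is an assumption
    JOIN `articles_to_categories`
        ON `articles_to_categories`.`article_id` = `articles`.`id`
WHERE
    `articles_to_categories`.`category_id` = 1
ORDER BY `articles`.`last_updated` DESC
LIMIT 0, 20;
```

With this plan MySQL walks articles from newest to oldest and probes the junction table for each one, so it tends to pay off when the category holds many articles and may be slower for a sparsely populated category. Compare both versions with EXPLAIN and real timings before settling on it.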

How to avoid "Using temporary" in many-to-many queries?
