Question

I'm trying to optimize this query:

SELECT articles.id 
FROM articles 
INNER JOIN articles_authors ON articles.id=articles_authors.fk_Articles 
WHERE articles_authors.fk_Authors=586 
ORDER BY articles.publicationDate LIMIT 0,50;

Table articles:

Engine: MyISAM
Row_format: Dynamic
Rows: 1 482 588
Data_length: 788 926 672
Max_data_length: 281 474 976 710 655
Index_length: 127 300 608
Data_free: 0
Checksum: NULL

    CREATE TABLE `articles` (
      `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
      `title` VARCHAR(255) NOT NULL,
      `publicationDate` DATE NOT NULL DEFAULT '1970-01-01',
      PRIMARY KEY (`id`),
      KEY `publicationDate` (`publicationDate`)
    ) ENGINE=MYISAM AUTO_INCREMENT=1498496 DEFAULT CHARSET=utf8

Table articles_authors:

Engine: MyISAM
Row_format: Dynamic
Rows: 1 970 750
Data_length: 45 008 420
Max_data_length: 281 474 976 710 655
Index_length: 127 300 608
Data_free: 0
Checksum: NULL

    CREATE TABLE `articles_authors` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `fk_Articles` int(10) unsigned NOT NULL,
      `fk_Authors` int(10) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `fk_Articles_fk_Authors` (`fk_Articles`,`fk_Authors`),
      KEY `fk_Articles` (`fk_Articles`),
      KEY `fk_Authors` (`fk_Authors`)
    ) ENGINE=MyISAM AUTO_INCREMENT=2349047 DEFAULT CHARSET=utf8

EXPLAIN on the query:

id(1), select_type(SIMPLE), table(articles_authors), type(ref), possible_keys(fk_Articles_fk_Authors, fk_Articles, fk_Authors), key(fk_Authors), key_len(4), ref(const), rows(171568), Extra(Using temporary; Using filesort)
id(1), select_type(SIMPLE), table(articles), type(eq_ref), possible_keys(PRIMARY), key(PRIMARY), key_len(4), ref(articles_authors.fk_Authors), rows(1), Extra()

As you can see, the SQL query is not optimized (EXPLAIN reports Using filesort).

Thanks for your help!

Answer 1

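It does use an index, as the EXPLAIN output says:

id(1), select_type(SIMPLE), table(articles_authors), type(ref), possible_keys(fk_Articles_fk_Authors, fk_Articles, fk_Authors), key(fk_Authors), key_len(4), ref(const), rows(171568), Extra(Using temporary; Using filesort)

The filesort is only an extra step: for the 50 rows it selects and then orders by publication date, it does a filesort.
It creates a temporary table containing the 50 items, which it then sorts.
It has to be done this way because MySQL cannot use the big index on those lonely 50 items; that would cost too much in I/O access time.

Sorting 50 numbers in memory is faster than accessing the index on disk.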
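One way to check how much sorting the query really does is to compare the session's sort counters before and after running it. The following is a minimal sketch, assuming an ordinary MySQL session; `Sort_rows`, `Sort_scan`, `Sort_range`, and `Sort_merge_passes` are standard MySQL status variables, but the values you see depend on your server version and data.

```sql
-- Record the session's sort counters before the query.
SHOW SESSION STATUS LIKE 'Sort%';

-- Run the query under investigation.
SELECT articles.id
FROM articles
INNER JOIN articles_authors ON articles.id = articles_authors.fk_Articles
WHERE articles_authors.fk_Authors = 586
ORDER BY articles.publicationDate
LIMIT 0, 50;

-- Read the counters again. The increase in Sort_rows is the number of rows
-- the server sorted; an increase in Sort_merge_passes suggests the sort did
-- not fit in the sort buffer and spilled to temporary files.
SHOW SESSION STATUS LIKE 'Sort%';
```

Comparing the two readings shows whether the sort is small enough to stay in memory or is a real cost worth removing.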

You can take a step to speed up the query:

OPTIMIZE TABLE articles, articles_authors;

and then re-run the query.

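Edit: a suggestion to speed things up by denormalizing the articles table

If you rewrite the query like this:

SELECT articles.id
FROM articles
WHERE articles.id IN (
    SELECT articles_authors.fk_Articles
    FROM articles_authors
    WHERE articles_authors.fk_Authors = 586
    LIMIT 0,50
)
ORDER BY articles.publicationDate;

You will probably see the same performance, but it highlights the problem. If author 586 has 180,000 articles, MySQL has to search for 50 items out of 180k in articles_authors, and then search for 50 items out of 180k again in the articles table.

If you merge the tables articles_authors and articles, your articles table will be denormalized (assuming an article can have multiple authors), but you no longer need the join and you save yourself the second search.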

CREATE TABLE `articles` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `publicationDate` date NOT NULL DEFAULT '1970-01-01',
  `title` varchar(255) NOT NULL,
  `fk_Authors` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `Articles_fk_Authors` (`id`,`fk_Authors`),
  KEY `fk_Authors` (`fk_Authors`),
  KEY `publicationDate` (`publicationDate`)
) ENGINE=MyISAM AUTO_INCREMENT=2349047 DEFAULT CHARSET=utf8

Now you can select like this:

SELECT articles.id FROM articles WHERE articles.fk_Authors = 586 ORDER BY articles.publicationDate LIMIT 0,50
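With the denormalized table, a composite index that starts with the filter column and continues with the sort column should let MySQL read the matching rows already in `publicationDate` order, so the filesort can drop out of the plan. This is only a sketch under that assumption: the index name `fk_Authors_publicationDate` is made up for illustration, and the plan is worth confirming with EXPLAIN on your own data.

```sql
-- Hypothetical index name; the columns come from the denormalized
-- articles table defined above.
ALTER TABLE articles
  ADD INDEX fk_Authors_publicationDate (fk_Authors, publicationDate);

-- Check the plan: ideally the key column shows the new index and
-- Extra no longer lists "Using filesort".
EXPLAIN
SELECT articles.id
FROM articles
WHERE articles.fk_Authors = 586
ORDER BY articles.publicationDate
LIMIT 0, 50;
```

The column order matters here: the equality filter on `fk_Authors` comes first so that, within one author, the index entries are already ordered by `publicationDate`.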

Answer 2

Maybe this will help you:

SELECT articles.id 
    FROM articles 
        INNER JOIN (SELECT fk_Articles FROM articles_authors WHERE articles_authors.fk_Authors=586) sub ON articles.id=sub.fk_Articles 
ORDER BY articles.publicationDate LIMIT 0,50;

Answer 3

SELECT articles.id 
FROM articles 
INNER JOIN articles_authors ON articles.id=articles_authors.fk_Articles 
WHERE articles.id=586 
ORDER BY articles.publicationDate LIMIT 0,50;

Answer 4

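I'm not sure, but Conrad's suggestion seems to swap the order of sorting and limiting, so you may end up with the first 50 items of an arbitrary list, in sorted order, rather than the first 50 items of the sorted list.

Would a view over the join help, if it is ordered by fk_author, publicationDate and has an index? It also depends on what you are optimizing for: speed or disk space?

Can you use IN in MySQL? Might it be optimized better? (Sample code, untested.)

SELECT id FROM articles WHERE id IN 
(SELECT fk_Articles FROM articles_authors WHERE fk_Authors=586)
ORDER BY publicationDate LIMIT 0,50;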

Answer 5

This may actually be efficient, depending on your data.

SELECT articles.id 
FROM articles 
INNER JOIN articles_authors ON articles.id=articles_authors.fk_Articles 
WHERE articles_authors.fk_Authors=586 
ORDER BY articles.publicationDate LIMIT 0,50;

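If, according to the statistics gathered by your database engine, articles_authors.fk_Authors = 586 matches fairly few rows, it will be cheaper to fetch them all and take the top 50 rows.

By contrast, if it matches most of the articles, it will be cheaper to walk the index on articles.publicationDate and filter out the non-matching rows until you have the 50 rows requested.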
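To compare the two strategies on your own data, you can push the optimizer toward each one and look at the plans and timings. The sketch below uses MySQL's `STRAIGHT_JOIN` and `FORCE INDEX` hints for that; whether either hint helps depends on how many rows author 586 has, so treat it as an experiment rather than a fix.

```sql
-- How selective is the filter? This is the number of rows that must be
-- fetched and sorted under the first strategy.
SELECT COUNT(*) FROM articles_authors WHERE fk_Authors = 586;

-- Strategy 1: read the author's rows first, join to articles, then sort.
EXPLAIN
SELECT articles.id
FROM articles_authors
STRAIGHT_JOIN articles ON articles.id = articles_authors.fk_Articles
WHERE articles_authors.fk_Authors = 586
ORDER BY articles.publicationDate
LIMIT 0, 50;

-- Strategy 2: walk articles in publicationDate order and keep only the
-- rows that belong to the author, stopping once 50 rows are found.
EXPLAIN
SELECT articles.id
FROM articles FORCE INDEX (publicationDate)
STRAIGHT_JOIN articles_authors ON articles.id = articles_authors.fk_Articles
WHERE articles_authors.fk_Authors = 586
ORDER BY articles.publicationDate
LIMIT 0, 50;
```

Dropping the EXPLAIN keyword and timing both statements shows which plan is actually cheaper for a given author.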

MySQL query: filesort with INNER JOIN, LIMIT, and ORDER BY