MySQL query on a table with 23M rows is very slow

Date: 2017-10-28 17:24:34

Tags: mysql query-optimization

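I am developing a PHP web service that needs to run a query against a table containing 23 million records. The query I wrote takes more than 30 seconds to complete, and I can tell that the ORDER BY part of the query is what causes the problem, because without it the query responds quickly.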

Here is the query:

SELECT artist_feeds.*, artists.name, artists.picture AS profile_picture
FROM artist_feeds
INNER JOIN user_artists ON user_artists.artist_id = artist_feeds.artist_id
INNER JOIN artists ON artists.id = artist_feeds.artist_id
WHERE artist_feeds.feed_date >= '2015-10-01'
    AND user_artists.user_id = 486
    AND NOT EXISTS (
        SELECT id FROM user_artist_disabled_networks AS uadn
        WHERE uadn.user_id = 486
            AND uadn.artist_id = artist_feeds.artist_id
            AND uadn.socialnetwork_id = artist_feeds.socialnetwork_id
        LIMIT 1
        )
ORDER BY artist_feeds.feed_date DESC
LIMIT 0, 20

The EXPLAIN output for the query is shown below:

[Image: EXPLAIN output (not reproduced)]

Can anyone offer any pointers?

As requested, here is the SHOW CREATE TABLE output:

CREATE TABLE `artist_feeds` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `feed_id` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `feed_date` datetime DEFAULT NULL,
  `message` text COLLATE utf8mb4_unicode_ci,
  `hash` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `type` varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `source` mediumtext COLLATE utf8mb4_unicode_ci,
  `picture` mediumtext COLLATE utf8mb4_unicode_ci,
  `link` mediumtext COLLATE utf8mb4_unicode_ci,
  `artist_id` int(11) DEFAULT '0',
  `socialnetwork_id` int(11) DEFAULT '0',
  `direct_link` mediumtext COLLATE utf8mb4_unicode_ci,
  `is_master_feed` tinyint(4) DEFAULT '0',
  `active` tinyint(4) DEFAULT '0',
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `rss_feed_id` int(11) DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `artist_id` (`artist_id`),
  KEY `socialnetwork_id` (`socialnetwork_id`),
  KEY `feedidnetwork` (`feed_id`(191),`socialnetwork_id`),
  KEY `feeddatenetworkid` (`feed_date`,`socialnetwork_id`),
  KEY `feeddatenetworkidartistid` (`artist_id`,`socialnetwork_id`,`feed_date`),
  KEY `type` (`type`),
  KEY `feed_date` (`feed_date`)
) ENGINE=InnoDB AUTO_INCREMENT=26991713 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

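SOLVED: Thanks to the pointers from Bill, I looked into changing the table access order so that the artist_feeds table is accessed first. That in turn removes the need for a filesort on the data, which is where the speed-up comes from.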

I ended up using STRAIGHT_JOIN instead of INNER JOIN. My working query is:

SELECT af.*, a.name, a.picture AS profile_picture
FROM artist_feeds AS af
STRAIGHT_JOIN user_artists AS ua ON ua.artist_id = af.artist_id
STRAIGHT_JOIN artists AS a ON a.id = af.artist_id
LEFT OUTER JOIN user_artist_disabled_networks AS uadn
  ON uadn.user_id = ua.user_id AND uadn.socialnetwork_id = af.socialnetwork_id
WHERE af.feed_date >= '2015-10-01'
    AND uadn.user_id IS NULL
    AND ua.user_id = 498
ORDER BY af.feed_date DESC
LIMIT 0, 20

The EXPLAIN output now looks like this:

[Image: EXPLAIN output (not reproduced)]

1 answer:

Answer 0 (score: 2)

I would write the query using an exclusion join instead of the NOT EXISTS subquery:

SELECT af.*, a.name, a.picture AS profile_picture
FROM artist_feeds AS af
INNER JOIN user_artists AS ua ON ua.artist_id = af.artist_id
INNER JOIN artists AS a ON a.id = af.artist_id
LEFT OUTER JOIN user_artist_disabled_networks AS uadn
  ON uadn.user_id = ua.user_id AND uadn.socialnetwork_id = af.socialnetwork_id
WHERE af.feed_date >= '2015-10-01'
  AND ua.user_id = 486
  AND uadn.user_id IS NULL
ORDER BY af.feed_date DESC
LIMIT 0, 20
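
One difference worth checking: the NOT EXISTS subquery in the question also correlates on uadn.artist_id, while the exclusion join above matches only on user_id and socialnetwork_id. If a network can be disabled per artist (an assumption based on the question's subquery, not something the answer states), a variant that keeps the original semantics might look like this sketch:

```sql
-- Sketch only: exclusion join that also matches on artist_id,
-- mirroring the three conditions in the question's NOT EXISTS subquery.
SELECT af.*, a.name, a.picture AS profile_picture
FROM artist_feeds AS af
INNER JOIN user_artists AS ua ON ua.artist_id = af.artist_id
INNER JOIN artists AS a ON a.id = af.artist_id
LEFT OUTER JOIN user_artist_disabled_networks AS uadn
  ON uadn.user_id = ua.user_id
  AND uadn.artist_id = af.artist_id              -- assumption: disabling is per artist
  AND uadn.socialnetwork_id = af.socialnetwork_id
WHERE af.feed_date >= '2015-10-01'
  AND ua.user_id = 486
  AND uadn.user_id IS NULL                       -- keep only rows with no matching "disabled" entry
ORDER BY af.feed_date DESC
LIMIT 0, 20;
```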

According to your EXPLAIN, the table access order is:

  1. ua: lookup by user_id
  2. a: lookup by PRIMARY KEY
  3. af: lookup by artist_id, with a range condition on feed_date
  4. uadn: lookup by user_id and socialnetwork_id

So you should have these indexes:

    • user_artists (user_id, artist_id)
    • artists needs only its PRIMARY KEY
    • artist_feeds (artist_id, feed_date)
    • user_artist_disabled_networks (user_id, socialnetwork_id)
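
The index list above could be created roughly as follows. This is a sketch: the index names are hypothetical, and only artist_feeds has its definition shown in the question, so check SHOW CREATE TABLE on the other tables first and skip any index that already exists.

```sql
-- Hypothetical index names; the column lists come from the recommendations above.
ALTER TABLE user_artists
  ADD INDEX idx_user_artist (user_id, artist_id);

-- artist_feeds already has (artist_id, socialnetwork_id, feed_date), but no
-- index with feed_date directly after artist_id.
ALTER TABLE artist_feeds
  ADD INDEX idx_artist_feeddate (artist_id, feed_date);

ALTER TABLE user_artist_disabled_networks
  ADD INDEX idx_user_network (user_id, socialnetwork_id);
```

On a 23-million-row InnoDB table, adding an index can take a while, so it may be worth trying on a copy of the data first.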

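A large part of your query's performance problem is no doubt the temporary table and filesort ("Using temporary; Using filesort" in EXPLAIN). They are unavoidable because your query does not access the artist_feeds table first, so the rows do not come out in feed_date order and must be sorted after the joins.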

Re the update in your question:

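Overriding the optimizer's table access order is usually not a good idea. You can see that you forced it to read the af table first, and it now has to examine 11.19 million entries in that table. At least it can avoid sorting the result itself, since it can rely on the natural order of the af table. But in this case I'm not sure that is a good trade-off.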
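
One way to weigh that trade-off is to compare how many rows each version actually reads, rather than relying on EXPLAIN estimates alone. The sketch below uses MySQL's session status counters. It assumes your account is allowed to run FLUSH STATUS (which needs the RELOAD privilege), and it selects only af.id to keep the example short.

```sql
-- Reset this session's status counters.
FLUSH STATUS;

-- Run one candidate query; repeat the whole script with the INNER JOIN version.
SELECT af.id
FROM artist_feeds AS af
STRAIGHT_JOIN user_artists AS ua ON ua.artist_id = af.artist_id
WHERE af.feed_date >= '2015-10-01'
  AND ua.user_id = 486
ORDER BY af.feed_date DESC
LIMIT 0, 20;

-- Row reads performed by the storage engine since the reset.
SHOW SESSION STATUS LIKE 'Handler_read%';

-- Sorting work and temporary tables created by the query.
SHOW SESSION STATUS LIKE 'Sort_%';
SHOW SESSION STATUS LIKE 'Created_tmp%';
```

Lower Handler_read counters with no sort or temporary table would suggest the forced join order is paying off for this user. A user who follows very few artists might show the opposite.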