Query with a HAVING clause is slow - can I speed it up?

Time: 2017-05-17 11:19:46

Tags: mysql performance having

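I have the following query, which produces the expected results but is very slow (it takes about 10 seconds; in my development environment the gstats table has about 130k rows, and production is much larger):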

SELECT count(d.id) AS dcount, s.id, s.name
FROM sites s
LEFT JOIN deals d ON (s.id = d.site_id AND d.is_active = 1)
WHERE (s.is_active = 1)
AND s.id IN(
    SELECT g.site_id
    FROM gstats g
    WHERE g.start_date > '2015-04-30'
    GROUP BY g.site_id
    HAVING SUM(g.results) > 100
)
GROUP BY s.id
ORDER BY dcount ASC

Am I doing something wrong? How can I speed this up?

Would using a view help? Or adding indexes?

3 answers:

Answer 0: (score: 1)

A quick fix is to filter inside the subquery:

SELECT count(d.id) AS dcount, s.id, s.name
FROM sites s
LEFT JOIN deals d ON (s.id = d.site_id AND d.is_active = 1)
WHERE (s.is_active = 1)
AND s.id IN(
    SELECT g.site_id
    FROM gstats g
    WHERE g.start_date > '2015-04-30' AND g.site_id = s.id
    GROUP BY g.site_id
    HAVING SUM(g.results) > 100
)
GROUP BY s.id
ORDER BY dcount ASC

Otherwise, the grouping query is computed for every possible candidate site. We can make this more elegant with EXISTS:

SELECT count(d.id) AS dcount, s.id, s.name
FROM sites s
LEFT JOIN deals d ON (s.id = d.site_id AND d.is_active = 1)
WHERE (s.is_active = 1)
AND EXISTS (
    SELECT 1
    FROM gstats g
    WHERE g.site_id = s.id AND g.start_date > '2015-04-30'
    HAVING SUM(g.results) > 100
)
GROUP BY s.id
ORDER BY dcount ASC
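
The EXISTS subquery above has a HAVING clause but no GROUP BY. In MySQL, all rows that pass the WHERE filter are then treated as a single group, so the subquery returns either one row or none. A minimal sketch for one site, where the id 42 is a made-up value used only for illustration:

```sql
-- 42 is a hypothetical site id standing in for the correlated s.id.
SELECT 1
FROM gstats g
WHERE g.site_id = 42 AND g.start_date > '2015-04-30'
HAVING SUM(g.results) > 100;
-- One row comes back if the results for that site sum to more than 100;
-- otherwise the result set is empty (also when no rows match, since SUM is then NULL).
```

EXISTS only checks whether that row is present, which is why the subquery can select the constant 1.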

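But we are not done yet: we now evaluate the EXISTS for every row. That is odd, because the subquery depends only on s.id, so it depends only on the group, not on the individual rows. A potential speedup, although it depends on the table sizes and so on, is therefore to move the condition into the HAVING clause: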

SELECT count(d.id) AS dcount, s.id, s.name
FROM sites s
LEFT JOIN deals d ON (s.id = d.site_id AND d.is_active = 1)
WHERE (s.is_active = 1)
GROUP BY s.id
HAVING EXISTS (
    SELECT 1
    FROM gstats g
    WHERE g.site_id = s.id AND g.start_date > '2015-04-30'
    HAVING SUM(g.results) > 100
)
ORDER BY dcount ASC

Answer 1: (score: 1)

Try moving the subquery to the FROM clause:

SELECT count(d.id) AS dcount, s.id, s.name
FROM sites s JOIN
     (SELECT g.site_id
      FROM gstats g
      WHERE g.start_date > '2015-04-30'
      GROUP BY g.site_id
      HAVING SUM(g.results) > 100
     ) g
     ON g.site_id = s.id LEFT JOIN
     deals d
     ON s.id = d.site_id AND d.is_active = 1
WHERE s.is_active = 1
GROUP BY s.id
ORDER BY dcount ASC;

I assume you have indexes on the join columns. You may also find that this helps performance:

SELECT s.id, s.name,
       (SELECT COUNT(*)
        FROM deals d
        WHERE d.site_id = s.id AND d.is_active = 1
       ) as dcount
FROM sites s JOIN
     (SELECT g.site_id
      FROM gstats g
      WHERE g.start_date > '2015-04-30'
      GROUP BY g.site_id
      HAVING SUM(g.results) > 100
     ) g
     ON g.site_id = s.id
WHERE s.is_active = 1
ORDER BY dcount ASC;

For this version, you want an index on deals(site_id, is_active).
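
As a sketch in MySQL syntax, that index could be created as follows. The index name idx_deals_site_active is an assumption chosen for illustration, not something taken from the question:

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX idx_deals_site_active ON deals (site_id, is_active);
```

Because the correlated subquery only reads site_id and is_active, the COUNT(*) can typically be answered from this index without touching the table rows.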

Answer 2: (score: 0)

The query looks fine. I suggest the following indexes:

create index idx_gstats on gstats(start_date, results, site_id);
create index idx_deals1 on deals(is_active, site_id);
create index idx_deals2 on deals(site_id, is_active);

Then look at the query's execution plan and drop whichever deals index is not used.
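
A minimal sketch of that check, using the query from the question. Which index ends up unused depends on your data, so the DROP statement below is only an assumed outcome:

```sql
-- Prefix the query with EXPLAIN to see the plan MySQL chooses.
EXPLAIN
SELECT count(d.id) AS dcount, s.id, s.name
FROM sites s
LEFT JOIN deals d ON (s.id = d.site_id AND d.is_active = 1)
WHERE (s.is_active = 1)
AND s.id IN(
    SELECT g.site_id
    FROM gstats g
    WHERE g.start_date > '2015-04-30'
    GROUP BY g.site_id
    HAVING SUM(g.results) > 100
)
GROUP BY s.id
ORDER BY dcount ASC;

-- The "key" column of the output names the index used for each table.
-- Assumption for illustration only: suppose idx_deals1 never appears there.
DROP INDEX idx_deals1 ON deals;
```

It is probably worth running the EXPLAIN against production-sized data as well, since the optimizer's choice can change with table size.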