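Is there any way to optimize this MySQL query?

Asked: 2012-10-08 10:30:32

Tags: mysql join subquery

This query takes a very long time to complete, but I would like to be able to pull the information it gathers quickly.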

SELECT *
FROM releases
WHERE (artist IN (SELECT artist FROM artist_love WHERE user='Quickinho')
OR
label IN (SELECT label FROM label_love WHERE user='Quickinho')
OR
id IN (SELECT release_id FROM charts_extended WHERE artist IN (SELECT dj FROM dj_love WHERE user='Quickinho'))
OR
id IN (SELECT artist FROM releases WHERE id IN (SELECT release_id FROM charts_extended WHERE user='Quickinho'))
OR
id IN (SELECT label FROM releases WHERE id IN (SELECT release_id FROM charts_extended WHERE user='Quickinho')))
AND
id NOT IN (SELECT release_id FROM charts_extended WHERE user='Quickinho')
ORDER BY date DESC
LIMIT 0,102

8 answers:

Answer 0 (score: 8)

Avoid the subselects altogether (this is untested, so please excuse any typos):

-- Select only columns from releases; DISTINCT removes rows duplicated by the joins
SELECT DISTINCT releases.*
FROM releases
LEFT OUTER JOIN artist_love ON releases.artist = artist_love.artist AND artist_love.user = 'Quickinho'
LEFT OUTER JOIN label_love ON releases.label = label_love.label AND label_love.user = 'Quickinho'
LEFT OUTER JOIN charts_extended ON releases.id = charts_extended.release_id
LEFT OUTER JOIN dj_love ON charts_extended.artist = dj_love.dj AND dj_love.user = 'Quickinho'
LEFT OUTER JOIN releases releases1 ON releases.id = releases1.artist
-- The original subquery matches charts_extended.release_id (not artist) against releases.id
LEFT OUTER JOIN charts_extended charts_extended1 ON charts_extended1.release_id = releases1.id AND charts_extended1.user = 'Quickinho'
LEFT OUTER JOIN releases releases2 ON releases.id = releases2.label
LEFT OUTER JOIN charts_extended charts_extended2 ON charts_extended2.release_id = releases2.id AND charts_extended2.user = 'Quickinho'
LEFT OUTER JOIN charts_extended charts_extended3 ON charts_extended3.release_id = releases.id AND charts_extended3.user = 'Quickinho'
WHERE (artist_love.user IS NOT NULL
OR label_love.user IS NOT NULL
OR dj_love.user IS NOT NULL
OR charts_extended1.user IS NOT NULL
OR charts_extended2.user IS NOT NULL)
AND charts_extended3.user IS NULL
-- Ordering and limit carried over from the query in the question
ORDER BY releases.date DESC
LIMIT 0,102

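Answer 1 (score: 4)

The optimized queries others have offered may still not be fast enough.

Suppose your original query takes 120 seconds and the best optimized query still takes 30 seconds, but you need a response time of 5 seconds or less. What can you do?

Pre-populate!

Run the query from a cron job that executes periodically, for example every hour. Use an INSERT ... SELECT query like this: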

INSERT INTO releases_queried
SELECT -- your query (your original one or one of the optimized ones)

See INSERT ... SELECT in the MySQL Manual. You then get the results of

SELECT * FROM releases_queried

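immediately, within milliseconds. This is a well-known technique for cutting response times. It works well if the data the query needs is always available.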
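One way to refresh such a table without readers ever seeing it half-filled is to build a fresh copy and swap it in. This is a minimal sketch for an empty scratch database, not the answerer's own setup: the column types of releases and the helper table names releases_queried_new and releases_queried_old are assumptions, and the SELECT is a placeholder for the real query.

```sql
-- Assumed minimal schema; the real releases table is not shown in the question.
CREATE TABLE IF NOT EXISTS releases (
  id INT PRIMARY KEY,
  artist VARCHAR(100),
  label VARCHAR(100),
  date DATE
);

-- One-time setup: the cache table mirrors the structure of releases.
CREATE TABLE IF NOT EXISTS releases_queried LIKE releases;

-- Refresh step, run from cron: fill a fresh copy first.
DROP TABLE IF EXISTS releases_queried_new, releases_queried_old;
CREATE TABLE releases_queried_new LIKE releases_queried;

INSERT INTO releases_queried_new
SELECT * FROM releases   -- placeholder: substitute the slow query here
ORDER BY date DESC
LIMIT 102;

-- RENAME TABLE performs both renames as a single atomic operation,
-- so readers see either the old result set or the new one.
RENAME TABLE releases_queried TO releases_queried_old,
             releases_queried_new TO releases_queried;
DROP TABLE releases_queried_old;
```

Because the query in the question is specific to one user, a shared cache like this would probably also need a user column; that design choice is left open here.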

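Real-world usage

Stack Overflow itself has many complex queries that are not computed on request but asynchronously. Badges, for example, are not calculated on every page visit but by a cron job.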

Answer 2 (score: 3)

...from releases
WHERE (artist IN (SELECT artist FROM artist_love WHERE user='Quickinho')

I suggest you use a JOIN instead of IN (SELECT ...).

You can do something like:

select r.* from releases r, artist_love al 
where r.artist = al.artist and al.user='Quickinho'
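One thing to watch when replacing IN (SELECT ...) with a join: if artist_love can hold the same artist more than once for a user, the join returns the release once per matching row, whereas IN returns it once. The sketch below, meant for an empty scratch database, uses explicit JOIN syntax and DISTINCT to keep the original semantics. The column types and the sample rows are assumptions for illustration only.

```sql
-- Assumed minimal schemas and sample data.
CREATE TABLE releases (
  id INT PRIMARY KEY,
  artist VARCHAR(100),
  label VARCHAR(100),
  date DATE
);
CREATE TABLE artist_love (
  user VARCHAR(64),
  artist VARCHAR(100)
);

INSERT INTO releases VALUES
  (1, 'A', 'X', '2012-10-01'),
  (2, 'B', 'Y', '2012-10-02');

-- The same artist is stored twice for the same user.
INSERT INTO artist_love VALUES
  ('Quickinho', 'A'),
  ('Quickinho', 'A');

-- Without DISTINCT this join would return release 1 twice;
-- with DISTINCT it returns release 1 once, as the IN version does.
SELECT DISTINCT r.*
FROM releases r
INNER JOIN artist_love al ON al.artist = r.artist
WHERE al.user = 'Quickinho';
```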

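Answer 3 (score: 2)

IN() and NOT IN() subqueries are poorly optimized
MySQL executes the subquery as a dependent subquery for each row in the outer query. This is a common cause of serious performance problems in MySQL 5.5 and earlier. The query should probably be rewritten as a JOIN or a LEFT OUTER JOIN, respectively.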
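To illustrate the NOT IN case, here is a minimal sketch for an empty scratch database. The column types and sample rows are assumptions. The rewrite is only equivalent when release_id cannot be NULL, which is why the assumed schema declares it NOT NULL.

```sql
-- Assumed minimal schemas and sample data.
CREATE TABLE releases (
  id INT PRIMARY KEY,
  artist VARCHAR(100),
  label VARCHAR(100),
  date DATE
);
CREATE TABLE charts_extended (
  release_id INT NOT NULL,
  artist VARCHAR(100),
  user VARCHAR(64)
);

INSERT INTO releases VALUES
  (1, 'A', 'X', '2012-10-01'),
  (2, 'B', 'Y', '2012-10-02');
INSERT INTO charts_extended VALUES (1, 'A', 'Quickinho');

-- Inspect the plan of the subquery form. On MySQL 5.5 and earlier the
-- select_type column is expected to show DEPENDENT SUBQUERY for the inner
-- SELECT; newer versions may choose a different plan.
EXPLAIN
SELECT r.id
FROM releases r
WHERE r.id NOT IN (SELECT release_id FROM charts_extended WHERE user = 'Quickinho');

-- Anti-join form: keep only releases with no matching chart row.
-- With the sample data this returns id 2.
SELECT r.id
FROM releases r
LEFT OUTER JOIN charts_extended ce
  ON ce.release_id = r.id AND ce.user = 'Quickinho'
WHERE ce.release_id IS NULL;
```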

SELECT *

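Selecting all columns with the * wildcard means the query's meaning and behavior change whenever the table's schema changes, and it may cause the query to retrieve more data than it needs.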
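As a small illustration, the sketch below names its columns explicitly. The four columns are the only ones of releases that appear in the question; their types are assumptions, and the real table may well have more columns that the application needs.

```sql
-- Assumed minimal schema; only these four columns are mentioned in the question.
CREATE TABLE IF NOT EXISTS releases (
  id INT PRIMARY KEY,
  artist VARCHAR(100),
  label VARCHAR(100),
  date DATE
);

-- An explicit column list keeps the result shape stable if columns are
-- later added to releases, and avoids fetching data that is not used.
SELECT r.id, r.artist, r.label, r.date
FROM releases r
ORDER BY r.date DESC
LIMIT 0, 102;
```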

Answer 4 (score: 1)

First, index every field used in the JOIN conditions.

Then try this query:

SELECT
  r.*
FROM
  releases r
LEFT JOIN (SELECT artist FROM artist_love WHERE user='Quickinho') al
  ON al.artist = r.artist
LEFT JOIN (SELECT label FROM label_love WHERE user='Quickinho') ll
  ON ll.label = r.label
LEFT JOIN (
    SELECT release_id FROM charts_extended ce
    INNER JOIN (SELECT dj FROM dj_love WHERE user='Quickinho') djl
      ON djl.dj = ce.artist
    ) ce
  ON r.id = ce.release_id
LEFT JOIN (
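    -- Expose id, artist and label: the outer ON and WHERE clauses reference r2.label and r2.id
    SELECT r1.id, r1.artist, r1.label FROM releases r1
    INNER JOIN (SELECT release_id FROM charts_extended WHERE user='Quickinho') ce1
      ON r1.id = ce1.release_id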
  ) r2
  ON r2.artist = r.id OR r2.label = r.id

LEFT JOIN (SELECT release_id FROM charts_extended WHERE user='Quickinho') ce2
  ON ce2.release_id = r.id

WHERE
  (al.artist IS NOT NULL OR ll.label IS NOT NULL OR ce.release_id IS NOT NULL OR r2.id IS NOT NULL)
  AND ce2.release_id IS NULL
GROUP BY
  r.id

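Answer 5 (score: 1)

Kickstart's solution is the right idea (although I would suggest joining on a USER table if possible; repeating user = 'Quickinho' that many times is not good practice). After that, consider adding indexes on some or all of the following fields: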

  • artist_love.artist
  • label_love.label
  • charts_extended.release_id
  • dj_love.dj
  • releases.artist
  • releases.label
  • charts_extended.release_id
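The fields listed above could be indexed along these lines. This is a sketch for an empty scratch database: the table definitions and column types are assumptions, and the index names are made up for illustration. On the real tables only the CREATE INDEX statements would apply.

```sql
-- Assumed minimal schemas; only the column names come from the question.
CREATE TABLE releases (
  id INT PRIMARY KEY,
  artist VARCHAR(100),
  label VARCHAR(100),
  date DATE
);
CREATE TABLE artist_love (user VARCHAR(64), artist VARCHAR(100));
CREATE TABLE label_love (user VARCHAR(64), label VARCHAR(100));
CREATE TABLE dj_love (user VARCHAR(64), dj VARCHAR(100));
CREATE TABLE charts_extended (
  release_id INT NOT NULL,
  artist VARCHAR(100),
  user VARCHAR(64)
);

-- One single-column index per listed field (index names are hypothetical).
CREATE INDEX idx_artist_love_artist ON artist_love (artist);
CREATE INDEX idx_label_love_label ON label_love (label);
CREATE INDEX idx_charts_extended_release_id ON charts_extended (release_id);
CREATE INDEX idx_dj_love_dj ON dj_love (dj);
CREATE INDEX idx_releases_artist ON releases (artist);
CREATE INDEX idx_releases_label ON releases (label);

-- List the indexes now defined on one of the tables.
SHOW INDEX FROM releases;
```

Running EXPLAIN on the rewritten query afterwards shows, in its key column, which of these indexes the optimizer actually picks.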

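That said, I can't say I understand what you are trying to achieve with this. There may be a better solution.

Answer 6 (score: 1)

You can search for key_cache, SQL Partition, and performance tuning.

Answer 7 (score: 1)

You can use JOINs to improve performance. With a JOIN, the RDBMS can build an execution plan that is better suited to your query, whereas with a subquery it may run each inner query and load all of its data for processing.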