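In the query below, I want to apply ORDER BY RAND() to the derived table c. When I put ORDER BY RAND() inside the JOIN, the query takes more than 5 seconds to execute, because the ORDER BY runs before the GROUP BY.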
UPDATE `backlinks` as a
JOIN (
SELECT b.`id` as bid
FROM `backlinks` b
WHERE b.`googlebot_id` IS NULL
AND b.`used_time` IS NULL
AND b.`campaign_id` IN (
SELECT `id` FROM `campaigns` WHERE `status`=true
)
GROUP BY b.`campaign_id`
) AS c ON a.id = c.bid
SET a.`crawler_id` = 'test'
limit 1;
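Answer 0 (score: 1)
Why use GROUP BY without an aggregate function?
If you just want one row for each b.campaign_id, use an aggregate function.
This avoids unpredictable values for the other columns, and it avoids the error that recent database versions raise for this kind of query.
A suitable aggregate function also removes the need for ORDER BY and LIMIT 1.
For better performance, you can replace the IN clause on the subquery with an INNER JOIN, which returns the same result and is typically faster: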
UPDATE `backlinks` as a
JOIN(
SELECT min(b.`id`) as bid
FROM `backlinks` b
INNER JOIN (
SELECT `id`
FROM `campaigns`
WHERE `status`=true
) t1 on t1.id = b.`campaign_id`
WHERE b.`googlebot_id` IS NULL
AND b.`used_time` IS NULL
GROUP BY b.`campaign_id`
) AS c ON a.id = c.bid
SET a.`crawler_id` = 'test'
limit 1;
Anyway, if you are using a MySQL version earlier than 5.7, you can use GROUP BY without an aggregate function, together with ORDER BY, but both of these hurt performance:
UPDATE `backlinks` as a
JOIN(
SELECT b.`id` as bid
FROM `backlinks` b
INNER JOIN (
SELECT `id`
FROM `campaigns`
WHERE `status`=true
) t1 on t1.id = b.`campaign_id`
WHERE b.`googlebot_id` IS NULL
AND b.`used_time` IS NULL
GROUP BY b.`campaign_id`
) AS c ON a.id = c.bid
SET a.`crawler_id` = 'test'
limit 1;
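Whether the non-aggregated form above is accepted depends on the server's SQL mode. As a minimal sketch, assuming MySQL 5.7.5 or later (where ONLY_FULL_GROUP_BY is part of the default sql_mode), the statements below check the active mode and show ANY_VALUE() as a way to state explicitly that any id from the group is acceptable. The SELECT is a simplified version of the subquery above, without the campaigns filter. Note that ANY_VALUE() returns an arbitrary row, not a random one.

```sql
-- Show the active SQL mode; look for ONLY_FULL_GROUP_BY in the result.
SELECT @@SESSION.sql_mode;

-- ANY_VALUE() marks the non-aggregated column as intentionally
-- nondeterministic, so the query passes ONLY_FULL_GROUP_BY.
-- Simplified for illustration: the campaigns filter is omitted here.
SELECT ANY_VALUE(b.`id`) AS bid
FROM `backlinks` b
WHERE b.`googlebot_id` IS NULL
  AND b.`used_time` IS NULL
GROUP BY b.`campaign_id`;
```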
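The only real ways to improve performance are using a JOIN instead of the IN clause and having a proper index on the campaign_id column of the backlinks table.
You could try placing rand() and LIMIT outside the grouping subquery but inside an enclosing subquery, and then join that result to the table being updated: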
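The index suggestion can be sketched as follows. The index name idx_backlinks_campaign_id is hypothetical, and whether the optimizer actually uses the index depends on the data, so the EXPLAIN is there to check rather than to promise a result.

```sql
-- Hypothetical index name; the table and column come from the queries above.
CREATE INDEX `idx_backlinks_campaign_id` ON `backlinks` (`campaign_id`);

-- Inspect the plan of the grouping subquery to see whether the index is used.
EXPLAIN
SELECT MIN(b.`id`) AS bid
FROM `backlinks` b
INNER JOIN `campaigns` t1 ON t1.`id` = b.`campaign_id`
WHERE t1.`status` = true
  AND b.`googlebot_id` IS NULL
  AND b.`used_time` IS NULL
GROUP BY b.`campaign_id`;
```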
UPDATE `backlinks` as a
INNER JOIN (
select a1.id
from backlinks as a1
INNER JOIN (
SELECT b.`id` as bid
FROM `backlinks` b
INNER JOIN (
SELECT `id`
FROM `campaigns`
WHERE `status`=true
) t1 on t1.id = b.`campaign_id`
WHERE b.`googlebot_id` IS NULL
AND b.`used_time` IS NULL
GROUP BY b.`campaign_id`
) AS c ON a1.id = c.bid
ORDER BY rand()
limit 1
) t on t.id = a.id
SET a.`crawler_id` = 'test';
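A more compact variant of the same idea is sketched below. It is my own arrangement rather than the answer's exact query: it combines the MIN() aggregate from the first suggestion with ORDER BY RAND() and LIMIT 1 inside the derived table. MySQL's multi-table UPDATE syntax does not accept ORDER BY or LIMIT, so the limit has to live in the derived table. Because the sort runs after the grouping, RAND() is evaluated once per campaign instead of once per candidate row. The sketch picks one campaign at random and then updates that campaign's lowest-id unused row.

```sql
UPDATE `backlinks` AS a
INNER JOIN (
    -- One candidate row per active campaign: the lowest unused id.
    SELECT MIN(b.`id`) AS bid
    FROM `backlinks` b
    INNER JOIN `campaigns` t1 ON t1.`id` = b.`campaign_id`
    WHERE t1.`status` = true
      AND b.`googlebot_id` IS NULL
      AND b.`used_time` IS NULL
    GROUP BY b.`campaign_id`
    -- Shuffle the per-campaign candidates and keep a single one.
    ORDER BY RAND()
    LIMIT 1
) AS c ON a.`id` = c.bid
SET a.`crawler_id` = 'test';
```

If a random row within each campaign is needed instead, this sketch does not provide it, since MIN() always selects the lowest id in the group.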