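I wrote this query after optimizing a PHP script that did the same job with 3 separate queries and 2 loops. That script ran for over 6 hours, so I condensed everything into a single query that does the same job without any loops: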
DELETE table FROM table WHERE id IN (
SELECT id from(
SELECT MAX(data_elab) as data_elab_new, count(*) as volte,t1.* FROM (
SELECT * from table ORDER BY data_elab DESC
)t1
group by cod_dl,issn,variante,add_on having volte>1
)t2
);
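Note: the server is very old (Windows, 3 GB RAM, 32-bit). The table is 204 MB, with 100,000 rows and 20 columns; id is the primary key and there are no other indexes.
Run on its own, the following SELECT takes only 20 seconds, so the DELETE is the problem: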
SELECT id from(
SELECT MAX(data_elab) as data_elab_new, count(*) as volte,t1.* FROM (
SELECT * from table ORDER BY data_elab DESC
)t1
group by cod_dl,issn,variante,add_on having volte>1
)t2
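I wanted to speed the operation up considerably, but after more than two hours the DELETE still has not finished and is still running.
Are there any suggestions for optimizing this query, or am I doing something wrong in it?
Thanks.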
Answer 0 (score: 0)
Assuming data_elab is never repeated for any combination of cod_dl, issn, variante, add_on (I assume this is what you meant by "univocal"), this is the form of query you need:
DELETE table
FROM table
WHERE (cod_dl, issn, variante, add_on, data_elab) IN (
SELECT cod_dl, issn, variante, add_on, MAX(data_elab) as data_elab_max
FROM table
GROUP BY cod_dl, issn, variante, add_on
HAVING COUNT(*) > 1
);
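Since MySQL does not much like DELETEing from and SELECTing from the same table in one query (it tends to fail with error 1093), you may need to tweak it a bit by wrapping the subquery in a derived table, like so: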
DELETE table
FROM table
WHERE (cod_dl, issn, variante, add_on, data_elab) IN (
SELECT extraLayerOfIndirection.*
FROM (
SELECT cod_dl, issn, variante, add_on, MAX(data_elab) as data_elab_max
FROM table
GROUP BY cod_dl, issn, variante, add_on
HAVING COUNT(*) > 1
) AS extraLayerOfIndirection
);
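The same deletion can also be written as a multi-table DELETE joined to the derived table, which MySQL sometimes handles better than a row-value IN subquery. This is only a sketch under the same assumptions as above: `table` is the placeholder name used in this thread (backtick-quoted here because TABLE is a reserved word), and the aliases t and latest are made up for illustration. I have not timed it against your data.

```sql
-- Sketch: delete the newest row of every duplicated group via a JOIN.
-- Assumptions: `table` is the thread's placeholder table name;
-- t and latest are illustrative aliases, not names from the question.
DELETE t
FROM `table` AS t
JOIN (
    -- One row per duplicated group, carrying that group's latest data_elab.
    SELECT cod_dl, issn, variante, add_on, MAX(data_elab) AS data_elab_max
    FROM `table`
    GROUP BY cod_dl, issn, variante, add_on
    HAVING COUNT(*) > 1
) AS latest
    ON  t.cod_dl    = latest.cod_dl
    AND t.issn      = latest.issn
    AND t.variante  = latest.variante
    AND t.add_on    = latest.add_on
    AND t.data_elab = latest.data_elab_max;
```

As with the IN version, rows where any of the grouping columns is NULL will not match, because `=` never evaluates to true against NULL.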
Also, not quite the same thing, but you might want to consider this:
DELETE table
FROM table
WHERE (cod_dl, issn, variante, add_on, data_elab) NOT IN (
SELECT extraLayerOfIndirection.*
FROM (
SELECT cod_dl, issn, variante, add_on, MIN(data_elab) as data_elab_min
FROM table
GROUP BY cod_dl, issn, variante, add_on
) AS extraLayerOfIndirection
);
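Instead of deleting just the last row of each group, this deletes every row except the first (earliest data_elab) of each group. If you have lots of repeats and only want to keep the first of each, the subquery result (one row per group) could be much smaller than the table. Be aware that NOT IN matches nothing for rows where one of the compared columns is NULL, so check for NULLs first.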
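Since the question says the table has no indexes besides the primary key, a composite index over the compared columns may be worth trying before running any of the DELETEs above. This is a guess based on the description, not something I have measured; idx_dedup is a hypothetical index name, and if the columns are long strings you may hit MySQL's index key length limit and need prefix lengths.

```sql
-- Sketch: composite index covering the grouping columns plus data_elab.
-- Assumptions: `table` is the thread's placeholder name; idx_dedup is hypothetical.
CREATE INDEX idx_dedup
    ON `table` (cod_dl, issn, variante, add_on, data_elab);

-- Check whether the grouping subquery now uses the index.
EXPLAIN
SELECT cod_dl, issn, variante, add_on, MAX(data_elab) AS data_elab_max
FROM `table`
GROUP BY cod_dl, issn, variante, add_on
HAVING COUNT(*) > 1;
```

If EXPLAIN still shows a full table scan with a temporary table, the index is not helping the subquery and can be dropped again.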