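I currently have a table with 604,000 rows. I want to delete 4,000 random rows so that the table contains exactly 600,000 entries.
Is there a fast way to do this?
Thanks a lot.
Answer 0: (score: 16)
In theory, this would be random and fast. In practice, it will only be fast, because without ORDER BY the server simply removes whichever 4000 rows it reaches first: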
DELETE FROM tableX
LIMIT 4000
This will be random, but very slow with 600K rows:
DELETE FROM tableX
ORDER BY RAND()
LIMIT 4000
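This is not truly random (because there are usually gaps in the ids), and it may not even delete exactly 4000 rows (it deletes somewhat fewer when there are many gaps), but it is probably faster than the previous query.
The extra wrapping in a subquery is needed because the multiple-table DELETE syntax does not allow LIMIT: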
DELETE td
FROM
    tableX AS td
  JOIN
    ( SELECT t.id
      FROM
          tableX AS t
        CROSS JOIN
          ( SELECT MAX(id) AS maxid
            FROM tableX
          ) AS m
        JOIN
          ( SELECT RAND() AS rndm
            FROM tableX AS tr
            LIMIT 5000
          ) AS r
          ON
            t.id = CEIL( rndm * maxid )
      LIMIT 4000
    ) AS x
    ON
      x.id = td.id
EXPLAIN output (for the subquery, on a 400K-row table):
id  select_type  table       type    possible_keys  key      key_len  ref   rows    Extra
1   PRIMARY      <derived2>  system                                         1
1   PRIMARY      <derived3>  ALL                                            5000
1   PRIMARY      t           eq_ref  PRIMARY        PRIMARY  4        func  1       Using where; Using index
3   DERIVED      tr          index                  PRIMARY  4              398681  Using index
2   DERIVED                                                                         Select tables optimized away
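To see why the join-based DELETE above can remove fewer than 4000 rows, here is a rough Python simulation of its id-picking step. It is only a sketch: the function name `simulate_join_delete` and the two example id layouts are assumptions made for illustration, and it models the join as "keep the first 4000 matches", which the server does not guarantee.

```python
import math
import random


def simulate_join_delete(existing_ids, max_id, draws=5000, limit=4000, seed=0):
    """Return how many distinct rows the join-based DELETE would remove.

    existing_ids: container of ids present in the table (a range or a set).
    max_id:       the value MAX(id) would return.
    """
    rng = random.Random(seed)
    matched = []
    for _ in range(draws):
        # Same expression as in the query: CEIL(RAND() * maxid)
        candidate = math.ceil(rng.random() * max_id)
        if candidate in existing_ids:  # the join only keeps ids that exist
            matched.append(candidate)
    # LIMIT 4000 caps the joined rows; a repeated id still deletes one row
    return len(set(matched[:limit]))


# Hypothetical layout 1: 604,000 rows with no gaps (ids 1..604000)
dense = simulate_join_delete(range(1, 604001), 604000)

# Hypothetical layout 2: 604,000 rows where every second id is missing
gapped = simulate_join_delete(range(2, 1208001, 2), 1208000)

print("rows deleted without gaps:", dense)
print("rows deleted with 50% gaps:", gapped)
```

Even without gaps the count falls slightly short of 4000 because the same random id can be drawn twice. With many gaps, a large share of the 5000 draws miss entirely, so raising the inner LIMIT 5000 is one way to compensate.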
Answer 1: (score: 1)
delete from yourTable limit 4000
Answer 2: (score: 1)
If I had to guess:
DELETE FROM tableX WHERE id IN (SELECT id FROM (SELECT id FROM tableX ORDER BY RAND() LIMIT 4000) AS picked)
Answer 3: (score: 0)
DELETE FROM TABLE ORDER BY RAND() LIMIT 4000;
It will take time...
A faster way to execute (though not to write the code!) might be 4000 separate deletes in a loop:
DELETE FROM TABLE WHERE AssumedPKisInt = <ARandomNumber>
Of course, you need to make sure you do not try to delete rows that do not exist or have already been deleted.
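The loop described above could look roughly like the following Python sketch. It uses the standard-library `sqlite3` module with an in-memory database purely so that it runs on its own; the original question is about MySQL, and the helper name `delete_random_rows`, the column name `id`, and the scaled-down row counts are all assumptions made for illustration. Misses on nonexistent or already deleted ids are handled by counting only the rows each DELETE actually removed.

```python
import random
import sqlite3


def delete_random_rows(conn, table, n, seed=None):
    """Delete n rows by trying random ids one at a time; return rows deleted.

    Assumes `table` is a trusted name with an integer primary key `id`.
    """
    rng = random.Random(seed)
    cur = conn.cursor()
    min_id, max_id = cur.execute(
        f"SELECT MIN(id), MAX(id) FROM {table}"
    ).fetchone()
    total = cur.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    n = min(n, total)  # never ask for more rows than exist
    deleted = 0
    while deleted < n:
        candidate = rng.randint(min_id, max_id)
        cur.execute(f"DELETE FROM {table} WHERE id = ?", (candidate,))
        # rowcount is 0 when the id was a gap or was already deleted
        deleted += cur.rowcount
    conn.commit()
    return deleted


# Scaled-down demo: 6,040 rows, delete 40 of them
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO tableX (id) VALUES (?)", ((i,) for i in range(1, 6041))
)
delete_random_rows(conn, "tableX", 40, seed=1)
print(conn.execute("SELECT COUNT(*) FROM tableX").fetchone()[0])
```

Each single-row delete is a primary-key lookup, which is why this can beat sorting the whole table by a random value. The trade-off is many round trips to the server, and the loop slows down if most ids in the range are gaps.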