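I have a table with columns like id | name | date | group ...
What I want to do is delete the old records of every group whose count exceeds 200.
For example, I have a group named "shoes" with 400 records, "gift cards" with 300 records, "electronics" with 100 records, and so on.
So after running the SQL query, I want each group (shoes, gift cards, electronics, etc.) to hold at most 200 records. The records to delete are the old ones, identified by date or by id (auto-increment). For example, 200 records would be deleted from the "shoes" group: the ones that are older than the records kept, or that have a lower id than the records kept.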
Answer 0 (score: 2)
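This type of problem is a bit awkward in MySQL because it does not implement SQL-99 window functions such as ROW_NUMBER(). This has been a long-standing feature request, but as of this writing it has not been implemented. MySQL is just about the only major RDBMS that does not have this feature.
Here is a solution that works in a single SQL statement and picks out only the members of each group beyond the first 200. It uses a MySQL feature called user variables, which carry their values from one row to the next as the query is processed.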
-- Number each group's rows newest first, then delete everything past row 200.
DELETE f FROM foo AS f
JOIN (SELECT id, IF(@g = `group`, @rn:=@rn+1, @rn:=1) AS row_number, @g:=`group`
      FROM foo, (SELECT @g:=null, @rn:=0) _init
      ORDER BY `group`, date DESC) AS r
  ON f.id = r.id AND r.row_number > 200;
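Before running this (or anything else that deletes data!), I suggest you understand how it works and test it with an equivalent SELECT to make sure it selects the rows you want to delete.
I tested this with a smaller data set. Here is the data when I run it with no filtering: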
SELECT f.id, f.`group`, r.row_number FROM foo AS f
JOIN (SELECT id, IF(@g = `group`, @rn:=@rn+1, @rn:=1) AS row_number, @g:=`group`
      FROM foo, (SELECT @g:=null, @rn:=0) _init
      ORDER BY `group`, date DESC) AS r
  ON f.id = r.id;
+----+-------+------------+
| id | group | row_number |
+----+-------+------------+
|  1 |     1 |          1 |
|  2 |     1 |          2 |
|  3 |     1 |          3 |
|  5 |     1 |          4 |
| 11 |     1 |          5 |
|  4 |     2 |          1 |
| 10 |     2 |          2 |
|  8 |     2 |          3 |
|  7 |     3 |          1 |
|  6 |     3 |          2 |
| 12 |     3 |          3 |
|  9 |     4 |          1 |
+----+-------+------------+
And here is the SELECT that skips the first 2 rows of each group:
SELECT f.id, f.`group`, r.row_number FROM foo AS f
JOIN (SELECT id, IF(@g = `group`, @rn:=@rn+1, @rn:=1) AS row_number, @g:=`group`
      FROM foo, (SELECT @g:=null, @rn:=0) _init
      ORDER BY `group`, date DESC) AS r
  ON f.id = r.id AND r.row_number > 2;
+----+-------+------------+
| id | group | row_number |
+----+-------+------------+
|  3 |     1 |          3 |
|  5 |     1 |          4 |
| 11 |     1 |          5 |
|  8 |     2 |          3 |
| 12 |     3 |          3 |
+----+-------+------------+
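To make the role of the two user variables easier to follow, the numbering can be traced outside the database. The sketch below is plain Python, not MySQL: it walks the rows in the same order as ORDER BY `group`, date DESC and keeps two local variables that play the part of @g and @rn. The sample dates are invented so that the ordering matches the test output above, and the function name rows_to_delete is only an illustrative choice.

```python
from datetime import date

# Hypothetical sample rows as (id, group, date). The dates are made up so the
# per-group ordering matches the answer's test output; they are not source data.
ROWS = [
    (1, 1, date(2013, 1, 12)), (2, 1, date(2013, 1, 11)),
    (3, 1, date(2013, 1, 10)), (5, 1, date(2013, 1, 9)),
    (11, 1, date(2013, 1, 8)),
    (4, 2, date(2013, 1, 12)), (10, 2, date(2013, 1, 11)),
    (8, 2, date(2013, 1, 10)),
    (7, 3, date(2013, 1, 12)), (6, 3, date(2013, 1, 11)),
    (12, 3, date(2013, 1, 10)),
    (9, 4, date(2013, 1, 12)),
]


def rows_to_delete(rows, keep):
    """Return the ids ranked beyond `keep` within their group, newest first."""
    # Same ordering as: ORDER BY `group`, date DESC
    ordered = sorted(rows, key=lambda r: (r[1], -r[2].toordinal()))
    g, rn = None, 0  # stand-ins for @g and @rn
    doomed = []
    for row_id, grp, _ in ordered:
        rn = rn + 1 if grp == g else 1  # IF(@g = `group`, @rn:=@rn+1, @rn:=1)
        g = grp                         # @g:=`group`
        if rn > keep:                   # r.row_number > keep
            doomed.append(row_id)
    return doomed


print(rows_to_delete(ROWS, keep=2))  # [3, 5, 11, 8, 12]
```

With keep=2 this yields the same five ids as the filtered SELECT above, and with keep=200 it yields nothing because no sample group is that large.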
Answer 1 (score: 1)
Run this pseudo-SQL:
SELECT shoes.id FROM shoes ORDER BY Date DESC LIMIT 200
Then parse the result into an array of ids (1, 2, etc.); call it $IDS.
DELETE FROM shoes WHERE ID NOT IN ($IDS)
Edit: To do it as a single SQL query, there are two possible approaches.
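1. DELETE FROM shoes WHERE ID NOT IN (SELECT id FROM (SELECT shoes.id FROM shoes ORDER BY Date DESC LIMIT 200) AS keep)
- Yes, you can do this, but be careful. The extra derived table (keep) is needed because MySQL rejects LIMIT directly inside an IN subquery and does not let the subquery read the DELETE target table directly. As Bill suggests, run it first as SELECT * FROM shoes WHERE ID NOT IN (SELECT id FROM (SELECT shoes.id FROM shoes ORDER BY Date DESC LIMIT 200) AS keep) to make sure it selects the right rows [the ones you want to delete!]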
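2. I don't know much about DECLARE, but in pseudo-SQL you could declare @IDs = SELECT shoes.id FROM shoes ORDER BY Date DESC LIMIT 200, and then DELETE FROM shoes WHERE ID NOT IN (@IDS). Note that a MySQL user variable holds a single scalar value rather than a list of ids, so this would need adapting before it could run.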
Both are untested. By the way, you should use SQLFiddle to set up a mock schema, so that people who come to help can test their queries.
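The two-step version (fetch the ids to keep, then delete everything else) could be sketched from application code roughly as follows. This is only an illustration under several assumptions: it uses Python's sqlite3 module with an in-memory database instead of MySQL, it assumes a single table foo with a "group" column as in the question rather than one table per group, and the helper name trim_group and the sample rows are made up. The ids are passed as bound parameters instead of being pasted into the SQL string.

```python
import sqlite3


def trim_group(conn, group, keep):
    """Keep the `keep` newest rows of one group (keep >= 1); delete the rest."""
    # Step 1: ids of the rows to keep, newest first (id breaks date ties).
    keep_ids = [row[0] for row in conn.execute(
        'SELECT id FROM foo WHERE "group" = ? '
        'ORDER BY date DESC, id DESC LIMIT ?', (group, keep))]
    if not keep_ids:
        return 0  # the group has no rows, so there is nothing to delete
    # Step 2: delete every other row of that group, one placeholder per id.
    marks = ",".join("?" * len(keep_ids))
    cur = conn.execute(
        f'DELETE FROM foo WHERE "group" = ? AND id NOT IN ({marks})',
        [group, *keep_ids])
    return cur.rowcount


# Hypothetical data: ids 1-5 are "shoes", ids 6-8 are "gift cards".
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT, '
             'date TEXT, "group" TEXT)')
conn.executemany(
    'INSERT INTO foo (name, date, "group") VALUES (?, ?, ?)',
    [(f"item{i}", f"2013-01-{i:02d}", "shoes" if i <= 5 else "gift cards")
     for i in range(1, 9)])

for grp in ("shoes", "gift cards"):
    trim_group(conn, grp, keep=2)

remaining = [row[0] for row in conn.execute("SELECT id FROM foo ORDER BY id")]
print(remaining)  # [4, 5, 7, 8]
```

Because the two steps are separate statements, rows inserted between them would be deleted as well; wrapping both in one transaction is the usual precaution.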
Answer 2 (score: 0)
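This would be a SQL Server solution. It lists the rows to delete: rows are numbered oldest first within each group, and a group holding Cnt rows gives up its oldest Cnt - 200.
SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY [Date]) AS RN  -- 1 = oldest
    FROM tbl) t1
INNER JOIN (
    SELECT [Group], COUNT(*) AS Cnt
    FROM tbl
    GROUP BY [Group]
) a ON a.[Group] = t1.[Group]
WHERE a.Cnt > 200            -- only groups over the limit
  AND t1.RN <= a.Cnt - 200   -- the oldest rows beyond the newest 200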
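Edit:
Here is the same query using a CTE; it also lists the rows to delete (the oldest Cnt - 200 rows of each group with more than 200 rows):
WITH CTE AS
(
    SELECT [Group], COUNT(*) AS Cnt
    FROM tbl
    GROUP BY [Group]
)
SELECT t1.*
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY [Date]) AS RN  -- 1 = oldest
      FROM tbl) t1
INNER JOIN CTE a ON a.[Group] = t1.[Group]
WHERE a.Cnt > 200            -- only groups over the limit
  AND t1.RN <= a.Cnt - 200   -- the oldest rows beyond the newest 200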
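The queries in this answer only select rows. As a rough sketch of how the same ROW_NUMBER() idea could drive the actual delete, the example below numbers each group's rows newest first and removes everything past the limit in one statement. It is not SQL Server code: it assumes Python's sqlite3 module backed by SQLite 3.25 or newer (the first release with window functions), and the table layout, sample rows, and the helper name trim_all_groups are invented for illustration.

```python
import sqlite3


def trim_all_groups(conn, keep):
    """Delete every row ranked beyond `keep` in its group, newest first."""
    cur = conn.execute(
        '''
        DELETE FROM tbl
        WHERE id IN (
            SELECT id FROM (
                SELECT id,
                       ROW_NUMBER() OVER (PARTITION BY "group"
                                          ORDER BY date DESC, id DESC) AS rn
                FROM tbl)
            WHERE rn > ?)
        ''', (keep,))
    return cur.rowcount


# Hypothetical data: 4 shoes, 3 gift cards, 1 electronics row.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl (id INTEGER PRIMARY KEY, date TEXT, "group" TEXT)')
conn.executemany(
    'INSERT INTO tbl (date, "group") VALUES (?, ?)',
    [("2013-01-01", "shoes"), ("2013-01-02", "shoes"),
     ("2013-01-03", "shoes"), ("2013-01-04", "shoes"),
     ("2013-01-01", "gift cards"), ("2013-01-02", "gift cards"),
     ("2013-01-03", "gift cards"), ("2013-01-01", "electronics")])

deleted = trim_all_groups(conn, keep=2)
counts = dict(conn.execute('SELECT "group", COUNT(*) FROM tbl GROUP BY "group"'))
print(deleted, counts)
```

Numbering newest first means the filter is simply rn > keep, so the per-group count is not needed; groups at or under the limit have no row with a number that high and are left untouched.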