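Query to delete records older than n active dates from each group

Time: 2015-11-30 18:49:10

Tags: c# sql sql-server tsql sql-server-2012

I have a data set that I need to prune daily. It is populated by a process that periodically writes records to a table.

I currently have a simple query: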

DELETE FROM dataTable WHERE entryDate < dateadd(day, -5, GETDATE())

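The problem is that this process is unreliable; there may be days on which no data is written at all.

So what I really need is a query that goes back 5 (possibly non-consecutive) days on which data was written, not 5 calendar days.

For example, if I run the following query: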

SELECT category, cast(entryDate as date) as LogDate
  FROM dataTable
  group by category, cast(entryDate as date)
  order by cast(entryDate as date) desc

I might get these results:

Category    Date
Foo        2015-11-30
Foo        2015-11-29
Foo        2015-11-26
Foo        2015-11-25
Foo        2015-11-21
Foo        2015-11-20  <-- Start Pruning here, not the 25th.
Foo        2015-11-19
Foo        2015-11-18

Bar        2015-11-30
Bar        2015-11-29
Bar        2015-11-28
Bar        2015-11-27
Bar        2015-11-26
Bar        2015-11-25  <-- This one is OK to prune at the 25th.
Bar        2015-11-24
Bar        2015-11-23

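For Foo, I need the query to go all the way back to the 20th before it starts deleting.

2 answers:

Answer 0 (score: 3)

You can use row_number to number the rows in each category from newest to oldest. As long as there is one row per day, that identifies the last 5 days that have entries in the table. Then delete based on the generated number.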

with rownums as (SELECT row_number() over(partition by category order by cast(entryDate as date) desc) as rn
                 ,*
                 FROM dataTable
)
delete from rownums where rn > 5; --keeps the 5 newest rows per category; rn <= 5 would delete those rows instead

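If there can be multiple entries per day, number the rows with dense_rank instead, so that all rows from the same date share one rank.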

with rownums as (SELECT dense_rank() over(partition by category order by cast(entryDate as date) desc) as rn
                     ,*
                 FROM dataTable)
delete from rownums where rn > 5;
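
One way to try the dense_rank version is against a scratch table, as sketched below. Only the category and entryDate columns come from the question; the temp table name #dataTable, the column types, and the sample rows are assumptions made up for illustration. Foo's dates follow the question's example, with two rows on the 30th.

```sql
-- Hypothetical scratch table; only category and entryDate come from the question.
CREATE TABLE #dataTable (category varchar(10) NOT NULL, entryDate datetime NOT NULL);

INSERT INTO #dataTable (category, entryDate) VALUES
    ('Foo', '2015-11-30T08:00:00'), ('Foo', '2015-11-30T17:00:00'), -- two rows on the same day
    ('Foo', '2015-11-29T08:00:00'),
    ('Foo', '2015-11-26T08:00:00'),
    ('Foo', '2015-11-25T08:00:00'),
    ('Foo', '2015-11-21T08:00:00'),
    ('Foo', '2015-11-20T08:00:00'), -- 6th active day, should be pruned
    ('Foo', '2015-11-19T08:00:00'); -- 7th active day, should be pruned

-- Rank calendar days (not rows) within each category, newest first.
WITH ranked AS (
    SELECT dense_rank() OVER (PARTITION BY category
                              ORDER BY cast(entryDate AS date) DESC) AS rn
          ,*
    FROM #dataTable
)
DELETE FROM ranked WHERE rn > 5;

-- Inspect what is left, one line per category and day.
SELECT category, cast(entryDate AS date) AS LogDate, count(*) AS entries
FROM #dataTable
GROUP BY category, cast(entryDate AS date)
ORDER BY category, LogDate DESC;

DROP TABLE #dataTable;
```

With this sample data, the rows for the 20th and 19th should be gone, while both rows from the 30th survive because they share rank 1.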

Answer 1 (score: 1)

Try something like this.

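;WITH orderedDates (LogDate, RowNum)
AS
(
    -- CACHEDATE presumably plays the role of the question's entryDate column.
    -- One row per distinct CACHEDATE value, numbered newest first across the whole table
    -- (no per-category partition; counts days only if CACHEDATE has no time part).
    SELECT CACHEDATE AS LogDate, ROW_NUMBER() OVER (ORDER BY CACHEDATE DESC) AS RowNum
    FROM dataTable
    GROUP BY CACHEDATE
)
DELETE dataTable
WHERE CACHEDATE IN
    (SELECT LogDate FROM orderedDates
     WHERE RowNum > 5) --or however many you need to go back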
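
For the per-category requirement in the question, the same idea could be sketched by numbering the distinct calendar days within each category and deleting through a join. This is an adaptation rather than the answerer's code: it assumes the question's column names (category, entryDate) instead of CACHEDATE, and the CTE name activeDays and the aliases d and a are made up for illustration.

```sql
;WITH activeDays AS
(
    -- One row per (category, calendar day), numbered newest first within each category.
    SELECT category,
           cast(entryDate AS date) AS LogDate,
           ROW_NUMBER() OVER (PARTITION BY category
                              ORDER BY cast(entryDate AS date) DESC) AS RowNum
    FROM dataTable
    GROUP BY category, cast(entryDate AS date)
)
DELETE d
FROM dataTable AS d
JOIN activeDays AS a
  ON a.category = d.category
 AND a.LogDate = cast(d.entryDate AS date)
WHERE a.RowNum > 5; -- or however many active days you need to keep
```

Because the grouping collapses each day to a single row before numbering, ROW_NUMBER here counts active days, so several entries on one day do not use up more than one of the 5 slots.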