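It's hard to come up with a clear, unambiguous title. :D My problem: I keep information about some logs stored on a NAS in a database. I want to delete the old logs that exceed a size limit (per storage). For example: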
rec  storage  name  size  status
1    x        pip   85    discarded
2    x        foo   25    available
3    x        bla   45    available
4    x        bar   35    available
5    x        wow   50    available
5    z        sid   25    available
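If the size limit is 100, foo and bla have to be deleted. But I don't want to delete the records from the database, I only want to mark them as discarded. How can I write a single query that returns that list (foo, bla in my case) and marks those entries as discarded? Preferably in O(n).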
Answer 0 (score: 0)
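Assuming rec is an auto-generated integer primary key (and that the duplicated rec 5 is a typo), so that the smaller the key, the older the log, this approach works with SQLite 3.25 and newer: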
-- Create table and populate with sample data.
CREATE TABLE logs(rec INTEGER PRIMARY KEY AUTOINCREMENT
                , storage TEXT
                , name TEXT
                , size INTEGER
                , status TEXT);
INSERT INTO logs VALUES(1,'x','pip',85,'discarded');
INSERT INTO logs VALUES(2,'x','foo',25,'available');
INSERT INTO logs VALUES(3,'x','bla',45,'available');
INSERT INTO logs VALUES(4,'x','bar',35,'available');
INSERT INTO logs VALUES(5,'x','wow',50,'available');
INSERT INTO logs VALUES(6,'z','sid',25,'available');
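(I normally don't like using AUTOINCREMENT with SQLite, but this is one case that calls for it, because it prevents an already generated rowid from ever being used again. Without it, reuse would be unlikely to happen, but let's be paranoid.)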
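The rowid reuse behind that remark can be seen in a small sketch using Python's standard sqlite3 module. The table names plain and strict are made up for this illustration; the only difference between them is the AUTOINCREMENT keyword.

```python
import sqlite3

# In-memory database; the table names "plain" and "strict" are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plain(rec INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE strict(rec INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

def next_rec_after_deleting_newest(table):
    """Insert three rows, delete the newest one, insert again, return the new rec."""
    for name in ("a", "b", "c"):
        conn.execute(f"INSERT INTO {table}(name) VALUES (?)", (name,))
    conn.execute(f"DELETE FROM {table} WHERE rec = 3")
    cur = conn.execute(f"INSERT INTO {table}(name) VALUES (?)", ("d",))
    return cur.lastrowid

plain_rec = next_rec_after_deleting_newest("plain")    # rec 3 is handed out again
strict_rec = next_rec_after_deleting_newest("strict")  # rec 3 stays retired
print(plain_rec, strict_rec)
```

Without AUTOINCREMENT the new row gets the largest existing rowid plus one, so a deleted newest log could lend its key to a later one and break the "smaller key means older log" assumption.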
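This query computes, for every available log, the total size of the available logs in the same storage, where the total covers the current log and all newer (higher rec) logs. It needs SQLite 3.25 or later, because that is when window functions were added.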
SELECT *
     , sum(size) OVER (PARTITION BY storage
                       ORDER BY rec
                       RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
       AS total_size
FROM logs
WHERE status = 'available'
ORDER BY rec;
It produces:
rec         storage     name        size        status      total_size
----------  ----------  ----------  ----------  ----------  -----------
2           x           foo         25          available   155
3           x           bla         45          available   130
4           x           bar         35          available   85
5           x           wow         50          available   50
6           z           sid         25          available   25
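To make the frame CURRENT ROW AND UNBOUNDED FOLLOWING concrete, here is a rough plain-Python sketch of the same per-storage total, computed by walking the rows from newest to oldest. The function name running_totals and the tuple layout are assumptions made for this illustration, not part of the SQL above.

```python
from collections import defaultdict

# (rec, storage, name, size, status) rows from the sample data.
rows = [
    (1, "x", "pip", 85, "discarded"),
    (2, "x", "foo", 25, "available"),
    (3, "x", "bla", 45, "available"),
    (4, "x", "bar", 35, "available"),
    (5, "x", "wow", 50, "available"),
    (6, "z", "sid", 25, "available"),
]

def running_totals(rows):
    """Map rec -> size of that log plus all newer available logs in its storage."""
    totals = {}
    acc = defaultdict(int)  # running sum per storage
    # Sorting is only needed to visit rows newest first; the scan itself is one pass.
    for rec, storage, _name, size, status in sorted(rows, reverse=True):
        if status != "available":
            continue  # already discarded logs do not count toward the limit
        acc[storage] += size
        totals[rec] = acc[storage]
    return totals

totals = running_totals(rows)
print(totals)
```

If the rows already arrive ordered by rec, the scan touches each row once, which is the linear behaviour the question asks about.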
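To get only the rows whose total_size column is greater than 100, you have to wrap the query in a common table expression (CTE), because you can't use a value produced by a window function in a WHERE clause: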
WITH running_totals AS
  (SELECT *
        , sum(size) OVER (PARTITION BY storage
                          ORDER BY rec
                          RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
          AS total_size
   FROM logs
   WHERE status = 'available')
SELECT * FROM running_totals WHERE total_size > 100 ORDER BY rec;
This yields:
rec         storage     name        size        status      total_size
----------  ----------  ----------  ----------  ----------  -----------
2           x           foo         25          available   155
3           x           bla         45          available   130
These are the entries you want. To mark those rows as discarded, you can use the CTE with an UPDATE statement as well:
WITH running_totals AS
  (SELECT rec
        , sum(size) OVER (PARTITION BY storage
                          ORDER BY rec
                          RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
          AS total_size
   FROM logs
   WHERE status = 'available')
UPDATE logs
SET status = 'discarded'
WHERE rec IN (SELECT rec FROM running_totals WHERE total_size > 100);
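Since the question asks for one statement that both returns the list and marks the rows, the following sketch runs the UPDATE from Python's sqlite3 module and, as an assumption that goes beyond the answer above, appends a RETURNING clause, which SQLite supports only from version 3.35. On older library versions it falls back to a SELECT followed by the UPDATE. The helper name discard_over_limit and its limit parameter are made up for this illustration.

```python
import sqlite3

# The CTE from the answer, reused as a prefix for the statements below.
CTE = """
WITH running_totals AS
  (SELECT rec
        , sum(size) OVER (PARTITION BY storage
                          ORDER BY rec
                          RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
          AS total_size
   FROM logs
   WHERE status = 'available')
"""
OVER_LIMIT = "(SELECT rec FROM running_totals WHERE total_size > ?)"

def discard_over_limit(conn, limit):
    """Mark logs beyond the limit as discarded; return their names, oldest first."""
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        # Assumption: RETURNING is available, so one statement does both jobs.
        hits = conn.execute(
            CTE + "UPDATE logs SET status = 'discarded' "
            "WHERE rec IN " + OVER_LIMIT + " RETURNING rec, name",
            (limit,)).fetchall()
    else:
        # Fallback for older SQLite: read the list first, then update.
        hits = conn.execute(
            CTE + "SELECT rec, name FROM logs WHERE rec IN " + OVER_LIMIT,
            (limit,)).fetchall()
        conn.execute(
            CTE + "UPDATE logs SET status = 'discarded' WHERE rec IN " + OVER_LIMIT,
            (limit,))
    conn.commit()
    # RETURNING gives no ordering guarantee, so sort by rec here.
    return [name for _rec, name in sorted(hits)]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE logs(rec INTEGER PRIMARY KEY AUTOINCREMENT
                                , storage TEXT, name TEXT
                                , size INTEGER, status TEXT)""")
conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", [
    (1, "x", "pip", 85, "discarded"),
    (2, "x", "foo", 25, "available"),
    (3, "x", "bla", 45, "available"),
    (4, "x", "bar", 35, "available"),
    (5, "x", "wow", 50, "available"),
    (6, "z", "sid", 25, "available"),
])

discarded = discard_over_limit(conn, 100)
still_available = [r[0] for r in conn.execute(
    "SELECT name FROM logs WHERE status = 'available' ORDER BY rec")]
print(discarded, still_available)
```

Passing the limit as a bound parameter keeps the statement reusable for storages with different quotas.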