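I'm curious whether SQLite indexes store a column's maximum and minimum values to help optimize queries against that column. My thinking is this: suppose we have a large file with millions of records, the index stores the max and min, and the condition on the column asks for a value greater than the max or less than the min. The query could then tell us immediately that no such record exists, without searching the database file first. Does anyone happen to know whether max and min are typically stored in DB indexes such as SQLite's?
Answer 0 (score: 1)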
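As far as I know, SQLite does not store such values separately from the data. However, you could easily do this yourself by creating a table that holds the minimum and maximum values and updating it each time a row is inserted.
However, updates and deletes may take longer whenever the minimum or maximum value has to change.
It may also be more efficient to store the rowid of the row that holds each value.
A table such as:-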
CREATE TABLE IF NOT EXISTS minmax_store(max_value INTEGER, max_rowid INTEGER, min_value INTEGER, min_rowid INTEGER);
The following is a working demo. It uses triggers to maintain the minmax_store table:-
DROP TABLE IF EXISTS mydata;
CREATE TABLE IF NOT EXISTS mydata(id INTEGER PRIMARY KEY, myvalue INTEGER);
DROP TABLE IF EXISTS minmax_store;
CREATE TABLE IF NOT EXISTS minmax_store (max_value INTEGER, max_rowid INTEGER, min_value INTEGER, min_rowid INTEGER);
-- Sentinel row: start max_value at the smallest 64-bit integer and min_value at the largest.
INSERT INTO minmax_store VALUES(-9223372036854775808,-1,9223372036854775807,-1);
DROP TRIGGER IF EXISTS maintain_minmax_after_insert;
CREATE TRIGGER IF NOT EXISTS maintain_minmax_after_insert AFTER INSERT ON mydata
BEGIN
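-- Also match NULL, so the store recovers after mydata has been emptied (the delete trigger then leaves NULLs behind).
UPDATE minmax_store SET max_value = new.myvalue, max_rowid = new.id WHERE max_value IS NULL OR max_value < new.myvalue;
UPDATE minmax_store SET min_value = new.myvalue, min_rowid = new.id WHERE min_value IS NULL OR min_value > new.myvalue;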
END
;
DROP TRIGGER IF EXISTS maintain_minmax_after_delete;
-- Recompute from mydata only when the deleted row held the current max or min.
CREATE TRIGGER IF NOT EXISTS maintain_minmax_after_delete AFTER DELETE ON mydata
WHEN (SELECT max_value FROM minmax_store) = old.myvalue OR (SELECT min_value FROM minmax_store) = old.myvalue
BEGIN
UPDATE minmax_store
SET max_value = (SELECT max(myvalue) FROM mydata), max_rowid = (SELECT rowid FROM mydata ORDER BY myvalue DESC LIMIT 1),
min_value = (SELECT min(myvalue) FROM mydata), min_rowid = (SELECT rowid FROM mydata ORDER BY myvalue ASC LIMIT 1);
END
;
DROP TRIGGER IF EXISTS maintain_minmax_after_update;
-- Recompute when the old value was the max or min, or the new value falls outside the stored range.
CREATE TRIGGER IF NOT EXISTS maintain_minmax_after_update AFTER UPDATE ON mydata
WHEN (SELECT max_value FROM minmax_store) = old.myvalue
OR (SELECT min_value FROM minmax_store) = old.myvalue
OR (SELECT max_value FROM minmax_store) < new.myvalue
OR (SELECT min_value FROM minmax_store) > new.myvalue
BEGIN
UPDATE minmax_store
SET max_value = (SELECT max(myvalue) FROM mydata), max_rowid = (SELECT rowid FROM mydata ORDER BY myvalue DESC LIMIT 1),
min_value = (SELECT min(myvalue) FROM mydata), min_rowid = (SELECT rowid FROM mydata ORDER BY myvalue ASC LIMIT 1);
END
;
INSERT INTO mydata (myvalue) VALUES(1),(4),(6),(7),(8),(3),(5),(0),(9),(100),(200),(55),(66),(33),(4421);
SELECT * FROM minmax_store;
SELECT *,
CASE
WHEN myvalue = (SELECT max_value FROM minmax_store) THEN 'MAX VALUE HERE' ELSE '' END AS isrowmaxvalue,
CASE
WHEN myvalue = (SELECT min_value FROM minmax_store) THEN 'MIN VALUE HERE' ELSE '' END AS isrowminvalue
FROM mydata;
DELETE FROM mydata WHERE myvalue = (SELECT min(myvalue) FROM mydata);
SELECT * FROM minmax_store;
SELECT *,
CASE
WHEN myvalue = (SELECT max_value FROM minmax_store) THEN 'MAX VALUE HERE' ELSE '' END AS isrowmaxvalue,
CASE
WHEN myvalue = (SELECT min_value FROM minmax_store) THEN 'MIN VALUE HERE' ELSE '' END AS isrowminvalue
FROM mydata;
UPDATE mydata SET myvalue = (SELECT max_value FROM minmax_store) + 10 WHERE myvalue = (SELECT min_value FROM minmax_store);
SELECT * FROM minmax_store;
SELECT *,
CASE
WHEN myvalue = (SELECT max_value FROM minmax_store) THEN 'MAX VALUE HERE' ELSE '' END AS isrowmaxvalue,
CASE
WHEN myvalue = (SELECT min_value FROM minmax_store) THEN 'MIN VALUE HERE' ELSE '' END AS isrowminvalue
FROM mydata;
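The first query returns (the minmax_store table):-
The second query returns:-
The third query, run after deleting the minimum row (value 0), shows the changed minmax_store as:-
The fourth query (the same as the second) returns:-
The fifth query, run after the row holding the minimum value has been changed to the maximum value + 10 (4431), shows the changed minmax_store as:-
The sixth query (the same as the second and fourth) returns:-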
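For anyone who wants to check the minmax_store results without the SQLite shell, the demo can be driven from Python's standard sqlite3 module. This is only a sketch: the SCHEMA string is a trimmed copy of the tables and triggers above, and the helper names snapshot and run_demo are made up for illustration, they are not part of the original answer.

```python
import sqlite3

# Trimmed copy of the demo schema and triggers (assumption: same logic as above).
SCHEMA = """
CREATE TABLE mydata(id INTEGER PRIMARY KEY, myvalue INTEGER);
CREATE TABLE minmax_store(max_value INTEGER, max_rowid INTEGER,
                          min_value INTEGER, min_rowid INTEGER);
INSERT INTO minmax_store VALUES(-9223372036854775808, -1, 9223372036854775807, -1);

CREATE TRIGGER maintain_minmax_after_insert AFTER INSERT ON mydata
BEGIN
  UPDATE minmax_store SET max_value = new.myvalue, max_rowid = new.id
   WHERE max_value < new.myvalue;
  UPDATE minmax_store SET min_value = new.myvalue, min_rowid = new.id
   WHERE min_value > new.myvalue;
END;

CREATE TRIGGER maintain_minmax_after_delete AFTER DELETE ON mydata
WHEN (SELECT max_value FROM minmax_store) = old.myvalue
  OR (SELECT min_value FROM minmax_store) = old.myvalue
BEGIN
  UPDATE minmax_store SET
    max_value = (SELECT max(myvalue) FROM mydata),
    max_rowid = (SELECT rowid FROM mydata ORDER BY myvalue DESC LIMIT 1),
    min_value = (SELECT min(myvalue) FROM mydata),
    min_rowid = (SELECT rowid FROM mydata ORDER BY myvalue ASC LIMIT 1);
END;

CREATE TRIGGER maintain_minmax_after_update AFTER UPDATE ON mydata
WHEN (SELECT max_value FROM minmax_store) = old.myvalue
  OR (SELECT min_value FROM minmax_store) = old.myvalue
  OR (SELECT max_value FROM minmax_store) < new.myvalue
  OR (SELECT min_value FROM minmax_store) > new.myvalue
BEGIN
  UPDATE minmax_store SET
    max_value = (SELECT max(myvalue) FROM mydata),
    max_rowid = (SELECT rowid FROM mydata ORDER BY myvalue DESC LIMIT 1),
    min_value = (SELECT min(myvalue) FROM mydata),
    min_rowid = (SELECT rowid FROM mydata ORDER BY myvalue ASC LIMIT 1);
END;
"""


def snapshot(conn):
    """Return the single minmax_store row as a tuple."""
    return conn.execute("SELECT * FROM minmax_store").fetchone()


def run_demo():
    """Replay the insert, delete and update steps; return the store after each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)

    values = [1, 4, 6, 7, 8, 3, 5, 0, 9, 100, 200, 55, 66, 33, 4421]
    conn.executemany("INSERT INTO mydata (myvalue) VALUES (?)",
                     [(v,) for v in values])
    after_insert = snapshot(conn)

    # Delete the row holding the current minimum.
    conn.execute(
        "DELETE FROM mydata WHERE myvalue = (SELECT min(myvalue) FROM mydata)")
    after_delete = snapshot(conn)

    # Move the current minimum row to max + 10.
    conn.execute(
        "UPDATE mydata SET myvalue = (SELECT max_value FROM minmax_store) + 10 "
        "WHERE myvalue = (SELECT min_value FROM minmax_store)")
    after_update = snapshot(conn)

    conn.close()
    return after_insert, after_delete, after_update


if __name__ == "__main__":
    labels = ("after insert", "after delete", "after update")
    for label, row in zip(labels, run_demo()):
        print(label, row)
```

Tracing the sample data by hand, the three snapshots should come out as (4421, 15, 0, 8), (4421, 15, 1, 1) and (4431, 1, 3, 6), in the column order max_value, max_rowid, min_value, min_rowid.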
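Answer 1 (score: 1)
The minimum and maximum values are not stored separately.
However, they are the first and last entries in the index, so they can be read quickly. This is known as the MIN/MAX optimization:
Queries that contain a single MIN() or MAX() aggregate function whose argument is the left-most column of an index might be satisfied by doing a single index lookup rather than by scanning the entire table. Examples: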
SELECT MIN(x) FROM table; SELECT MAX(x)+1 FROM table;
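One way to see this optimization at work is to ask SQLite for its query plan. The sketch below uses Python's sqlite3 module. The table t, the index idx_t_x and the helper minmax_plan are hypothetical names chosen for the example, and the exact wording of the plan text differs between SQLite versions, so treat the printed plan as indicative only.

```python
import sqlite3


def minmax_plan():
    """Return the query plans for MIN/MAX queries on an indexed column,
    together with the values those queries produce."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, x INTEGER)")
    conn.execute("CREATE INDEX idx_t_x ON t(x)")
    conn.executemany("INSERT INTO t(x) VALUES (?)",
                     [(v,) for v in range(1000)])

    plans = {}
    for sql in ("SELECT MIN(x) FROM t", "SELECT MAX(x)+1 FROM t"):
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        # The last column of each plan row is the human-readable detail text.
        plans[sql] = [row[-1] for row in rows]

    (lowest,) = conn.execute("SELECT MIN(x) FROM t").fetchone()
    (highest_plus_one,) = conn.execute("SELECT MAX(x)+1 FROM t").fetchone()
    conn.close()
    return plans, lowest, highest_plus_one


if __name__ == "__main__":
    plans, lowest, highest_plus_one = minmax_plan()
    for sql, details in plans.items():
        print(sql, "->", details)
    print(lowest, highest_plus_one)
```

If the plan mentions idx_t_x, the answer came from the index rather than from a pass over the table itself.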
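If you search for a specific value that lies outside the range of values in the column, a binary search on the index will quickly determine that no page contains a matching value. (The upper levels of the index B-tree will practically always be cached, so there is no point in keeping a copy somewhere else.)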
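The out-of-range case from the question can be tried the same way. This is a rough sketch with Python's sqlite3 module, again using the made-up names t, idx_t_x and out_of_range_lookup. It only shows that the lookup goes through the index and finds nothing, it does not measure how many pages are read.

```python
import sqlite3


def out_of_range_lookup(limit=10_000):
    """Search an indexed column for values above its maximum.

    Returns the query plan detail lines and the (expected empty) result rows.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, x INTEGER)")
    conn.execute("CREATE INDEX idx_t_x ON t(x)")
    # x takes the values 0 .. limit-1, so nothing is greater than limit.
    conn.executemany("INSERT INTO t(x) VALUES (?)",
                     [(v,) for v in range(limit)])

    sql = "SELECT id FROM t WHERE x > ?"
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (limit,)).fetchall()
    detail = [row[-1] for row in plan_rows]  # last column is the plan text
    matches = conn.execute(sql, (limit,)).fetchall()
    conn.close()
    return detail, matches


if __name__ == "__main__":
    detail, matches = out_of_range_lookup()
    print(detail)
    print(matches)
```

The design point is the one made above: the index is already ordered, so a separate min/max copy would not save the engine any real work for this kind of query.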