Do SQLite indexes store an int column's maximum and minimum values as an optimization?

Time: 2019-01-04 04:41:50

Tags: database sqlite indexing

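I am curious whether SQLite indexes store a column's maximum and minimum values to help optimize queries against it. My thinking was this: suppose we have a large file containing millions of records, the index happens to store the max and min, and a condition on that column asks for a value greater than the max or less than the min. The query could then tell us immediately that no such record exists, without having to search the database file first. Does anyone happen to know whether max and min are typically stored in database indexes such as SQLite's?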

2 answers:

Answer 0: (score: 1)

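As far as I know, SQLite does not store such values separately from the data. However, you could easily do this yourself by creating a table that holds the minimum and maximum values and updating it every time a row is inserted.

Updates and deletes could become more expensive, though, because the stored minimum or maximum may have to be recomputed when the row holding it changes.

It may also be worthwhile to store the rowid of the row that holds each value.

A table such as:-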

CREATE TABLE IF NOT EXISTS minmax_store(max_value INTEGER, max_rowid INTEGER, min_value INTEGER, min_rowid INTEGER);

The following is a working demonstration. It uses triggers to maintain the minmax_store table:-

DROP TABLE IF EXISTS mydata;
CREATE TABLE IF NOT EXISTS mydata(id INTEGER PRIMARY KEY, myvalue INTEGER);
DROP TABLE IF EXISTS minmax_store;
CREATE TABLE IF NOT EXISTS minmax_store (max_value INTEGER, max_rowid INTEGER, min_value INTEGER, min_rowid INTEGER);
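-- Seed with sentinels: the smallest and largest 64-bit integers, so any first insert replaces both.
INSERT INTO minmax_store VALUES(-9223372036854775808,-1,9223372036854775807,-1);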
DROP TRIGGER IF EXISTS maintain_minmax_after_insert;
CREATE TRIGGER IF NOT EXISTS maintain_minmax_after_insert AFTER INSERT ON mydata
    BEGIN
        -- The IS NULL checks cover the case where mydata was emptied and the stored values became NULL.
        UPDATE minmax_store SET max_value = new.myvalue, max_rowid = new.id WHERE max_value IS NULL OR max_value < new.myvalue;
        UPDATE minmax_store SET min_value = new.myvalue, min_rowid = new.id WHERE min_value IS NULL OR min_value > new.myvalue;
    END
;
DROP TRIGGER IF EXISTS maintain_minmax_after_delete;
CREATE TRIGGER IF NOT EXISTS maintain_minmax_after_delete AFTER DELETE ON mydata 
    WHEN (SELECT max_value FROM minmax_store) = old.myvalue OR (SELECT min_value FROM minmax_store) = old.myvalue
    BEGIN
        UPDATE minmax_store 
            SET max_value = (SELECT max(myvalue) FROM mydata), max_rowid = (SELECT rowid FROM mydata ORDER BY myvalue DESC LIMIT 1),
            min_value = (SELECT min(myvalue) FROM mydata), min_rowid = (SELECT rowid FROM mydata ORDER BY myvalue ASC LIMIT 1);
    END
;
DROP TRIGGER IF EXISTS maintain_minmax_after_update;
CREATE TRIGGER IF NOT EXISTS maintain_minmax_after_update AFTER UPDATE ON mydata
    WHEN (SELECT max_value FROM minmax_store) = old.myvalue
        OR (SELECT min_value FROM minmax_store) = old.myvalue
        OR (SELECT max_value FROM minmax_store) < new.myvalue
        OR (SELECT min_value FROM minmax_store) > new.myvalue
    BEGIN
        UPDATE minmax_store
            SET max_value = (SELECT max(myvalue) FROM mydata), max_rowid = (SELECT rowid FROM mydata ORDER BY myvalue DESC LIMIT 1),
            min_value = (SELECT min(myvalue) FROM mydata), min_rowid = (SELECT rowid FROM mydata ORDER BY myvalue ASC LIMIT 1);
    END
;

INSERT INTO mydata (myvalue) VALUES(1),(4),(6),(7),(8),(3),(5),(0),(9),(100),(200),(55),(66),(33),(4421);
SELECT * FROM minmax_store;

SELECT *, 
    CASE 
        WHEN myvalue = (SELECT max_value FROM minmax_store) THEN 'MAX VALUE HERE' ELSE '' END AS isrowmaxvalue,
    CASE
        WHEN myvalue = (SELECT min_value FROM minmax_store) THEN 'MIN VALUE HERE' ELSE '' END AS isrowminvalue
    FROM mydata;

DELETE FROM mydata WHERE myvalue = (SELECT min(myvalue) FROM mydata);

SELECT * FROM minmax_store;
SELECT *, 
    CASE 
        WHEN myvalue = (SELECT max_value FROM minmax_store) THEN 'MAX VALUE HERE' ELSE '' END AS isrowmaxvalue,
    CASE
        WHEN myvalue = (SELECT min_value FROM minmax_store) THEN 'MIN VALUE HERE' ELSE '' END AS isrowminvalue
    FROM mydata;

UPDATE mydata SET myvalue = (SELECT max_value FROM minmax_store) + 10 WHERE myvalue = (SELECT min_value FROM minmax_store);
SELECT * FROM minmax_store;
SELECT *, 
    CASE 
        WHEN myvalue = (SELECT max_value FROM minmax_store) THEN 'MAX VALUE HERE' ELSE '' END AS isrowmaxvalue,
    CASE
        WHEN myvalue = (SELECT min_value FROM minmax_store) THEN 'MIN VALUE HERE' ELSE '' END AS isrowminvalue
    FROM mydata;

The first query returns the minmax_store table: max_value is 4421 (max_rowid 15) and min_value is 0 (min_rowid 8).

The second query lists every row of mydata, flagging the rows that hold the maximum and the minimum value.

The third query, run after the minimum row (value 0) has been deleted, shows the changed minmax_store: min_value is now 1 (min_rowid 1), while max_value is still 4421.

The fourth query (the same as the second) lists mydata again, with the flag on the new minimum row.

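The fifth query, run after the row with the minimum value has been changed to the maximum + 10 (4431), shows the changed minmax_store: max_value is now 4431 (max_rowid 1) and min_value is 3 (min_rowid 6).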

The sixth query (the same as the second and fourth) lists mydata once more, with both flags moved accordingly.

  • Note: the above is provided as in-principle code. It has not been extensively tested, so it may contain errors and inefficiencies.
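Because the code is untested, one way to build some confidence in triggers of this kind is to compare the stored values with a direct MIN()/MAX() after every change. The following is only a minimal sketch using Python's standard sqlite3 module. It uses a condensed variant of the triggers (no rowid columns, no WHEN clauses), the trigger names and the helpers `build` and `check` are hypothetical, and treating a NULL stored value as "no value yet" is an assumption of this sketch.

```python
import sqlite3

# Condensed, hypothetical variant of the triggers above: values only, no rowids.
SCHEMA = """
CREATE TABLE mydata(id INTEGER PRIMARY KEY, myvalue INTEGER);
CREATE TABLE minmax_store(max_value INTEGER, min_value INTEGER);
INSERT INTO minmax_store VALUES (NULL, NULL);  -- NULL means "no value yet"

CREATE TRIGGER mm_ins AFTER INSERT ON mydata BEGIN
    UPDATE minmax_store SET max_value = new.myvalue
        WHERE max_value IS NULL OR max_value < new.myvalue;
    UPDATE minmax_store SET min_value = new.myvalue
        WHERE min_value IS NULL OR min_value > new.myvalue;
END;

CREATE TRIGGER mm_del AFTER DELETE ON mydata BEGIN
    UPDATE minmax_store
        SET max_value = (SELECT max(myvalue) FROM mydata),
            min_value = (SELECT min(myvalue) FROM mydata);
END;

CREATE TRIGGER mm_upd AFTER UPDATE OF myvalue ON mydata BEGIN
    UPDATE minmax_store
        SET max_value = (SELECT max(myvalue) FROM mydata),
            min_value = (SELECT min(myvalue) FROM mydata);
END;
"""


def build():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def check(conn):
    """Assert the stored pair equals a direct aggregate, then return it."""
    stored = conn.execute(
        "SELECT min_value, max_value FROM minmax_store").fetchone()
    actual = conn.execute(
        "SELECT min(myvalue), max(myvalue) FROM mydata").fetchone()
    assert stored == actual, (stored, actual)
    return stored


conn = build()
values = [1, 4, 6, 7, 8, 3, 5, 0, 9, 100, 200, 55, 66, 33, 4421]
conn.executemany("INSERT INTO mydata(myvalue) VALUES (?)",
                 [(v,) for v in values])
after_insert = check(conn)

# Delete the row holding the current minimum (0).
conn.execute(
    "DELETE FROM mydata WHERE myvalue = (SELECT min(myvalue) FROM mydata)")
after_delete = check(conn)

# Move the new minimum (1) above the current maximum (4421).
conn.execute("UPDATE mydata SET myvalue = 4431 WHERE myvalue = 1")
after_update = check(conn)

# Empty the table, then insert again to exercise the NULL handling.
conn.execute("DELETE FROM mydata")
after_clear = check(conn)
conn.execute("INSERT INTO mydata(myvalue) VALUES (42)")
after_refill = check(conn)

print(after_insert, after_delete, after_update, after_clear, after_refill)
```

Recomputing on every delete and update keeps the sketch short. The WHEN clauses in the demonstration above avoid that work when the changed row cannot affect the minimum or maximum.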

Answer 1: (score: 1)

The minimum and maximum values are not stored separately.

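However, they are the first and last entries in the index, so they can be read quickly. This is called the MIN/MAX optimization:

Queries that contain a single MIN() or MAX() aggregate function whose argument is the left-most column of an index might be satisfied by doing a single index lookup rather than by scanning the entire table. Examples: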

SELECT MIN(x) FROM table;
SELECT MAX(x)+1 FROM table;
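To see which access path SQLite actually picks for queries like these, you can ask for the query plan. This is a small sketch using Python's standard sqlite3 module. The table `t`, the index `idx_t_x` and the helper `plan` are made up for illustration, and the exact wording of the plan text differs between SQLite versions, so it is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and index, used only for this illustration.
conn.executescript("""
CREATE TABLE t(id INTEGER PRIMARY KEY, x INTEGER);
CREATE INDEX idx_t_x ON t(x);
""")
conn.executemany("INSERT INTO t(x) VALUES (?)", [(v,) for v in range(1000)])


def plan(sql):
    """Return the detail text of EXPLAIN QUERY PLAN (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


min_plan = plan("SELECT MIN(x) FROM t")
max_plan = plan("SELECT MAX(x)+1 FROM t")
print(min_plan)  # expected to mention idx_t_x rather than a full table scan
print(max_plan)

smallest = conn.execute("SELECT MIN(x) FROM t").fetchone()[0]
next_value = conn.execute("SELECT MAX(x)+1 FROM t").fetchone()[0]
print(smallest, next_value)
```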

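If you search for a specific value that lies outside the range of the column's values, the binary search on the index will quickly determine that no page contains a matching value. (The upper levels of the index B-tree will practically always be cached, so there is no point in keeping a copy somewhere else.)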
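The out-of-range case can be tried in the same way. The sketch below uses Python's standard sqlite3 module, and the table `t`, the index `idx_t_x` and the bound 5000 are assumptions chosen for illustration. It shows that a condition beyond the largest stored value is answered through an index search and returns no rows. It does not measure how many pages are read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and index; x holds the values 0..999.
conn.executescript("""
CREATE TABLE t(id INTEGER PRIMARY KEY, x INTEGER);
CREATE INDEX idx_t_x ON t(x);
""")
conn.executemany("INSERT INTO t(x) VALUES (?)", [(v,) for v in range(1000)])

# 5000 is larger than every stored x, so nothing can match.
rows = conn.execute("SELECT id FROM t WHERE x > ?", (5000,)).fetchall()

# The last column of EXPLAIN QUERY PLAN holds the readable description.
detail = " | ".join(
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN SELECT id FROM t WHERE x > 5000")
)
print(rows)
print(detail)  # expected to describe a SEARCH using idx_t_x
```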