sqlite> .schema movie
CREATE TABLE movie (
id INTEGER PRIMARY KEY, title TEXT, year INTEGER, nth TEXT, for_video BOOLEAN
);
sqlite> select count(*) from movie;
count(*)
----------
530256
sqlite>
The query in question: in which years of human history was at least one movie released?
$ sqlite movie.db
sqlite> select DISTINCT year from movie order by year asc;
sqlite> select * from sqlite_master where type='index';
index|index1|movie|194061|CREATE INDEX index1 on movie (year)
sqlite> .quit
$
$ time printf "select DISTINCT year from movie order by year;" | sqlite3 movie.db > /dev/null
real 0m0.086s
user 0m0.064s
sys 0m0.020s
$ time printf "select DISTINCT year from movie indexed by index1 order by year;" | sqlite3 movie.db > /dev/null
real 0m0.092s
user 0m0.088s
sys 0m0.000s
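A single run of time mostly measures process startup and output, so differences of a few milliseconds are hard to interpret. The sketch below is one way to repeat the comparison in-process with Python's standard sqlite3 and timeit modules. It builds a small synthetic movie table in memory (the row count and year range are made up for illustration, not taken from movie.db), and it adds a third variant using NOT INDEXED as a baseline that is not part of the original measurements.

```python
import sqlite3
import timeit

# Synthetic stand-in for movie.db: 50,000 rows spread over 120 years (assumed values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movie (title, year) VALUES (?, ?)",
    [(f"title {i}", 1900 + i % 120) for i in range(50_000)],
)
conn.execute("CREATE INDEX index1 ON movie (year)")

queries = {
    "planner's choice": "SELECT DISTINCT year FROM movie ORDER BY year",
    "forced index1": "SELECT DISTINCT year FROM movie INDEXED BY index1 ORDER BY year",
    "no index": "SELECT DISTINCT year FROM movie NOT INDEXED ORDER BY year",
}

def run(sql):
    """Execute the query and fetch every row so the full cost is measured."""
    return conn.execute(sql).fetchall()

timings = {}
for label, sql in queries.items():
    # Best of 5 repeats, each repeat running the query 5 times.
    timings[label] = min(timeit.repeat(lambda: run(sql), number=5, repeat=5))
    print(f"{label:>18}: {timings[label]:.4f}s for 5 runs")
```

Taking the minimum over several repeats reduces noise from other processes; the absolute numbers will differ from those measured against the real database file.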
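My understanding is that running a select query on movie without an index requires scanning 530256 rows, because the table movie has 530256 records. To reduce that scanning, index1 was created on the movie table over the non-key field year.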
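With the index forced, the query is actually slightly slower (0.092s versus 0.086s real time).
Can a SELECT DISTINCT query be optimized by using an index?
Or does an index only improve the performance of SQL queries that have a WHERE clause and no GROUP BY, i.e. queries that return one specific (single) group of results?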
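One way to investigate this is to ask SQLite which plan it picks for each form of the query using EXPLAIN QUERY PLAN. The sketch below does so from Python against a small in-memory table with the same schema and index; the sample rows are invented for illustration. If both plans mention index1, the planner may already be using the index without the INDEXED BY hint, which would be consistent with the two timings being nearly identical.

```python
import sqlite3

# In-memory stand-in for movie.db with the same schema and index (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie ("
    "id INTEGER PRIMARY KEY, title TEXT, year INTEGER, nth TEXT, for_video BOOLEAN)"
)
conn.executemany(
    "INSERT INTO movie (title, year) VALUES (?, ?)",
    [(f"title {i}", 1900 + i % 120) for i in range(10_000)],
)
conn.execute("CREATE INDEX index1 ON movie (year)")

def query_plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The detail text is the last column in both old and new SQLite output formats.
    return [row[-1] for row in rows]

plain = query_plan("SELECT DISTINCT year FROM movie ORDER BY year")
forced = query_plan("SELECT DISTINCT year FROM movie INDEXED BY index1 ORDER BY year")

print("without hint:", plain)
print("with INDEXED BY:", forced)
```

The same check can be run directly in the sqlite3 shell by prefixing either query with EXPLAIN QUERY PLAN. The exact wording of the plan text varies between SQLite versions.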