Today I made some changes to a table to try to make certain kinds of queries run faster. This is the table (before my changes):
CREATE TABLE IF NOT EXISTS street_addresses (
id INTEGER PRIMARY KEY NOT NULL,
house_number INTEGER NOT NULL,
entrance TEXT NOT NULL,
latitude REAL NOT NULL,
longitude REAL NOT NULL,
street_name INTEGER NOT NULL REFERENCES street_names(id),
postal_code INTEGER NOT NULL REFERENCES postal_codes(id),
city INTEGER NOT NULL REFERENCES cities(id),
municipality INTEGER NOT NULL REFERENCES municipalities(id),
CONSTRAINT unique_address UNIQUE(
street_name, house_number, entrance, postal_code, city
)
)
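This table has two indexes (that I can identify): the primary key and the 5-column unique key. I often need to look up street addresses using only the house number and postal code columns, or only the house number and city columns, so I changed the table creation SQL to: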
CREATE TABLE IF NOT EXISTS street_addresses (
id INTEGER PRIMARY KEY NOT NULL,
house_number INTEGER NOT NULL,
entrance TEXT NOT NULL,
latitude REAL NOT NULL,
longitude REAL NOT NULL,
street_name INTEGER NOT NULL REFERENCES street_names,
postal_code INTEGER NOT NULL REFERENCES postal_codes,
city INTEGER NOT NULL REFERENCES cities,
municipality INTEGER NOT NULL REFERENCES municipalities
);
CREATE INDEX IF NOT EXISTS sa_hn_pc
ON street_addresses (house_number, postal_code);
CREATE INDEX IF NOT EXISTS sa_hn_ci
ON street_addresses (house_number, city);
CREATE UNIQUE INDEX IF NOT EXISTS sa_unique_address
ON street_addresses (
street_name, house_number, entrance, postal_code, city
);
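I added two indexes and moved the UNIQUE index out of the table definition (so that all the keys are in one place). I also removed the (id) from the REFERENCES clauses because, according to the documentation, a foreign key defaults to the parent's primary key. My database is now much larger, but at least fetching addresses by house number and postal code is dozens of times faster!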
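As a quick check of the claim that a bare REFERENCES clause targets the parent's primary key, here is a minimal sketch using Python's standard sqlite3 module. The two-table schema is a cut-down, hypothetical version of the one above, the function name is made up for illustration, and the sketch assumes foreign key enforcement is switched on with PRAGMA foreign_keys, which SQLite leaves off by default.

```python
import sqlite3


def bare_references_is_enforced() -> bool:
    """Return True if a child row pointing at a missing parent id is rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.executescript("""
        CREATE TABLE street_names (
            id INTEGER PRIMARY KEY NOT NULL,
            name TEXT NOT NULL
        );
        -- No "(id)" after the parent table: the parent's primary key is used.
        CREATE TABLE street_addresses (
            id INTEGER PRIMARY KEY NOT NULL,
            house_number INTEGER NOT NULL,
            street_name INTEGER NOT NULL REFERENCES street_names
        );
    """)
    conn.execute("INSERT INTO street_names (id, name) VALUES (1, 'FORNEBUVEIEN')")
    # A reference to an existing parent id is accepted.
    conn.execute(
        "INSERT INTO street_addresses (house_number, street_name) VALUES (11, 1)"
    )
    try:
        # street_names has no row with id 999, so this insert should fail.
        conn.execute(
            "INSERT INTO street_addresses (house_number, street_name) VALUES (11, 999)"
        )
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


print(bare_references_is_enforced())
```

If the pragma is left off, both inserts succeed, so the shorter REFERENCES form only matters for enforcement once foreign keys are enabled on the connection.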
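Unfortunately, queries that search by street name and house number (the most common kind of query against my database) no longer seem to use my index. Before the table change I got about 1700 reads per second when searching by street name and house number; now I get about 50. If I search using all 5 columns I still get the good old speed, but searching on only the first 2 columns of the UNIQUE key is now very slow.
Also, queries using house number and city are still almost as slow as before, and much slower than searching by house number and postal code.
Any idea what is going on here? Do I need to define a new index for street name and house number, even though those columns are part of the UNIQUE key? If so, why were my queries so fast before? And why are the house number and city queries not as fast as the house number and postal code queries?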
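On whether the first two columns of the UNIQUE key can serve a query by themselves: SQLite can generally use a leftmost prefix of a multi-column index. The sketch below uses Python's sqlite3 module on an empty in-memory table that has only the 5-column index (the foreign keys are dropped to keep it self-contained, and the helper name plan is hypothetical). It collects the plans for a prefix query and for a query that skips the leading column. The exact plan wording varies between SQLite versions, so the output is something to inspect rather than rely on.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for one statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # 'detail' is the last column


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE street_addresses (
        id INTEGER PRIMARY KEY NOT NULL,
        house_number INTEGER NOT NULL,
        entrance TEXT NOT NULL,
        latitude REAL NOT NULL,
        longitude REAL NOT NULL,
        street_name INTEGER NOT NULL,
        postal_code INTEGER NOT NULL,
        city INTEGER NOT NULL
    );
    CREATE UNIQUE INDEX sa_unique_address ON street_addresses (
        street_name, house_number, entrance, postal_code, city
    );
""")

# Constrains the first two index columns: a leftmost prefix.
prefix_plan = plan(
    conn,
    "SELECT latitude, longitude FROM street_addresses "
    "WHERE street_name = ? AND house_number = ?",
    (1, 11),
)

# Skips the leading column (street_name), so it is not a leftmost prefix.
non_prefix_plan = plan(
    conn,
    "SELECT latitude, longitude FROM street_addresses "
    "WHERE house_number = ? AND postal_code = ?",
    (11, 1),
)

print("prefix:    ", prefix_plan)
print("non-prefix:", non_prefix_plan)
```

This only covers a single-table lookup. The real query reaches street_name through a join on street_names, so the join order the planner picks also decides whether the prefix is usable.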
Sorry for the wall of text. I hope someone can help. Below are my benchmark results and the SELECT query I use.
Before the table change:
$ bin/benchmark_norway_database --search-by-components 10000 --street_name --house_number
[ ============================ 100% (10000/10000) ============================ ]
5.9129 seconds
0.0006 seconds per interval
1691 intervals per second

$ bin/benchmark_norway_database --search-by-components 10000 --street_name --house_number --entrance --postal_code --city
[ ============================ 100% (10000/10000) ============================ ]
3.2198 seconds
0.0003 seconds per interval
3106 intervals per second

$ bin/benchmark_norway_database --search-by-components 100 --house_number --postal_code
[ ============================== 100% (100/100) ============================== ]
9.957 seconds
0.0996 seconds per interval
10 intervals per second

$ bin/benchmark_norway_database --search-by-components 100 --house_number --city
[ ============================== 100% (100/100) ============================== ]
10.2446 seconds
0.1024 seconds per interval
10 intervals per second
After the table change:
# This is now so dreadfully slow I can't do 10000 intervals.
$ bin/benchmark_norway_database --search-by-components 500 --street_name --house_number
[ ============================== 100% (500/500) ============================== ]
9.5749 seconds
0.0191 seconds per interval
52 intervals per second

# Still fast!
$ bin/benchmark_norway_database --search-by-components 10000 --street_name --house_number --entrance --postal_code --city
[ ============================ 100% (10000/10000) ============================ ]
3.4125 seconds
0.0003 seconds per interval
2930 intervals per second

# Much, much faster than before!
$ bin/benchmark_norway_database --search-by-components 10000 --house_number --postal_code
[ ============================ 100% (10000/10000) ============================ ]
22.2646 seconds
0.0022 seconds per interval
449 intervals per second

# Still slow? Why? :S
$ bin/benchmark_norway_database --search-by-components 500 --house_number --city
[ ============================== 100% (500/500) ============================== ]
14.3483 seconds
0.0287 seconds per interval
35 intervals per second
SELECT
sn.name, sa.house_number, sa.entrance, pc.postal_code,
ci.name, mu.name, co.name, sa.latitude, sa.longitude
FROM
street_addresses AS sa
INNER JOIN street_names AS sn ON sa.street_name = sn.id
INNER JOIN postal_codes AS pc ON sa.postal_code = pc.id
INNER JOIN cities AS ci ON sa.city = ci.id
INNER JOIN municipalities AS mu ON sa.municipality = mu.id
INNER JOIN counties AS co ON mu.county = co.id
WHERE
...
ORDER BY
ci.name ASC, sn.name ASC, sa.house_number ASC, sa.entrance ASC
LIMIT
0, 100
Note: in the WHERE clause I use GLOB when searching by street name, for example:
WHERE
sn.name GLOB "FORNEBUVEIEN" AND
sa.house_number = 11
CREATE TABLE IF NOT EXISTS counties (
id INTEGER PRIMARY KEY NOT NULL,
name TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS municipalities (
id INTEGER PRIMARY KEY NOT NULL,
name TEXT NOT NULL,
number INTEGER NOT NULL,
county INTEGER NOT NULL REFERENCES counties,
CONSTRAINT unique_municipality UNIQUE(name, county)
);
CREATE UNIQUE INDEX IF NOT EXISTS mu_number
ON municipalities (number);
CREATE UNIQUE INDEX IF NOT EXISTS mu_unique_name_co
ON municipalities (name, county);
CREATE TABLE IF NOT EXISTS cities (
id INTEGER PRIMARY KEY NOT NULL,
name TEXT NOT NULL,
municipality INTEGER NOT NULL REFERENCES municipalities
);
CREATE UNIQUE INDEX IF NOT EXISTS ci_unique_name_mu
ON cities (name, municipality);
CREATE TABLE IF NOT EXISTS postal_codes (
id INTEGER PRIMARY KEY NOT NULL,
postal_code INTEGER NOT NULL,
city INTEGER NOT NULL REFERENCES cities
);
CREATE UNIQUE INDEX IF NOT EXISTS po_postal_code
ON postal_codes (postal_code);
CREATE TABLE IF NOT EXISTS street_names (
id INTEGER PRIMARY KEY NOT NULL,
name TEXT NOT NULL
);
CREATE UNIQUE INDEX IF NOT EXISTS sn_name
ON street_names (name);
CREATE TABLE IF NOT EXISTS street_addresses (
id INTEGER PRIMARY KEY NOT NULL,
house_number INTEGER NOT NULL,
entrance TEXT NOT NULL,
latitude REAL NOT NULL,
longitude REAL NOT NULL,
street_name INTEGER NOT NULL REFERENCES street_names,
postal_code INTEGER NOT NULL REFERENCES postal_codes,
city INTEGER NOT NULL REFERENCES cities,
municipality INTEGER NOT NULL REFERENCES municipalities
);
CREATE INDEX IF NOT EXISTS sa_hn_pc
ON street_addresses (house_number, postal_code);
CREATE INDEX IF NOT EXISTS sa_hn_ci
ON street_addresses (house_number, city);
CREATE UNIQUE INDEX IF NOT EXISTS sa_unique_address
ON street_addresses (
street_name, house_number, entrance, postal_code, city
);
PRAGMA journal_mode = OFF;
PRAGMA page_size = 65536;
VACUUM;
sqlite> EXPLAIN QUERY PLAN SELECT sn.name, sa.house_number, sa.entrance, pc.postal_code, ci.name, mu.name, co.name, sa.latitude, sa.longitude FROM street_addresses AS sa INNER JOIN street_names AS sn ON sa.street_name = sn.id INNER JOIN postal_codes AS pc ON sa.postal_code = pc.id INNER JOIN cities AS ci ON sa.city = ci.id INNER JOIN municipalities AS mu ON sa.municipality = mu.id INNER JOIN counties AS co ON mu.county = co.id WHERE sn.name GLOB "FORNEBUVEIEN" AND sa.house_number=11 ORDER BY ci.name ASC, sn.name ASC, sa.house_number ASC, sa.entrance ASC LIMIT 0, 100;
selectid    order       from        detail
----------  ----------  ----------  -------------------------------------------------------------------------
0           0           0           SEARCH TABLE street_addresses AS sa USING INDEX sa_hn_ci (house_number=?)
0           1           1           SEARCH TABLE street_names AS sn USING INTEGER PRIMARY KEY (rowid=?)
0           2           2           SEARCH TABLE postal_codes AS pc USING INTEGER PRIMARY KEY (rowid=?)
0           3           3           SEARCH TABLE cities AS ci USING INTEGER PRIMARY KEY (rowid=?)
0           4           4           SEARCH TABLE municipalities AS mu USING INTEGER PRIMARY KEY (rowid=?)
0           5           5           SEARCH TABLE counties AS co USING INTEGER PRIMARY KEY (rowid=?)
0           0           0           USE TEMP B-TREE FOR ORDER BY
Answer 0 (score: 0)
It turns out that with a WHERE clause like this in my SELECT query:
WHERE
sn.name GLOB ? AND
sa.house_number = ?
SQLite3 chooses the index sa_hn_ci (house_number, city) instead of sa_unique_address. This makes the query run about 100 times slower.
I now work around this by using INDEXED BY whenever my query includes a street name:
SELECT
sn.name, sa.house_number, sa.entrance, pc.postal_code,
ci.name, mu.name, co.name, sa.latitude, sa.longitude
FROM
street_addresses AS sa INDEXED BY sa_unique_address -- This line!
INNER JOIN street_names AS sn ON sa.street_name = sn.id
INNER JOIN postal_codes AS pc ON sa.postal_code = pc.id
INNER JOIN cities AS ci ON sa.city = ci.id
INNER JOIN municipalities AS mu ON sa.municipality = mu.id
INNER JOIN counties AS co ON mu.county = co.id
WHERE
sn.name GLOB "FORNEBUVEIEN" AND
sa.house_number=11
ORDER BY
ci.name ASC, sn.name ASC, sa.house_number ASC, sa.entrance ASC
LIMIT
0, 100;
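One way to confirm that the hint takes effect is to compare query plans with and without it. This is a sketch using Python's sqlite3 module against an empty in-memory copy of two of the tables above. The query is trimmed to the one join that matters, the foreign keys to postal_codes and cities are dropped, and the helper name explain is hypothetical. Plans on the real, populated database may well differ.

```python
import sqlite3

# Cut-down copy of the schema: only the two tables involved in the filter.
SCHEMA = """
CREATE TABLE street_names (
    id INTEGER PRIMARY KEY NOT NULL,
    name TEXT NOT NULL
);
CREATE UNIQUE INDEX sn_name ON street_names (name);
CREATE TABLE street_addresses (
    id INTEGER PRIMARY KEY NOT NULL,
    house_number INTEGER NOT NULL,
    entrance TEXT NOT NULL,
    latitude REAL NOT NULL,
    longitude REAL NOT NULL,
    street_name INTEGER NOT NULL REFERENCES street_names,
    postal_code INTEGER NOT NULL,
    city INTEGER NOT NULL
);
CREATE INDEX sa_hn_pc ON street_addresses (house_number, postal_code);
CREATE INDEX sa_hn_ci ON street_addresses (house_number, city);
CREATE UNIQUE INDEX sa_unique_address ON street_addresses (
    street_name, house_number, entrance, postal_code, city
);
"""

QUERY = """
SELECT sn.name, sa.house_number, sa.entrance, sa.latitude, sa.longitude
FROM street_addresses AS sa {hint}
INNER JOIN street_names AS sn ON sa.street_name = sn.id
WHERE sn.name GLOB ? AND sa.house_number = ?
"""


def explain(conn, hint=""):
    """Return the plan 'detail' strings for QUERY, with an optional index hint."""
    sql = "EXPLAIN QUERY PLAN " + QUERY.format(hint=hint)
    return [row[-1] for row in conn.execute(sql, ("FORNEBUVEIEN", 11))]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

unhinted = explain(conn)
hinted = explain(conn, "INDEXED BY sa_unique_address")

print("without hint:")
print("\n".join(unhinted))
print("with INDEXED BY sa_unique_address:")
print("\n".join(hinted))
```

With INDEXED BY, SQLite either uses the named index for that table or fails to prepare the statement, so the hinted plan should never fall back to sa_hn_ci.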
But I do not know why SQLite3 picks the wrong index. Running ANALYZE did not change anything.
I am not marking this answer as correct.
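The plan in the question shows street_addresses as the outermost loop, searched through sa_hn_ci on house_number alone, with street_names looked up by rowid afterwards. An alternative to INDEXED BY that SQLite documents is to pin the join order instead: SQLite keeps the left table of a CROSS JOIN in the outer loop. The sketch below is not the method used in this answer. It runs on an empty in-memory copy of two tables, with a trimmed query and no foreign keys to postal_codes or cities, so whether it helps on the real data is an assumption that would need benchmarking.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE street_names (
        id INTEGER PRIMARY KEY NOT NULL,
        name TEXT NOT NULL
    );
    CREATE UNIQUE INDEX sn_name ON street_names (name);
    CREATE TABLE street_addresses (
        id INTEGER PRIMARY KEY NOT NULL,
        house_number INTEGER NOT NULL,
        entrance TEXT NOT NULL,
        latitude REAL NOT NULL,
        longitude REAL NOT NULL,
        street_name INTEGER NOT NULL REFERENCES street_names,
        postal_code INTEGER NOT NULL,
        city INTEGER NOT NULL
    );
    CREATE INDEX sa_hn_pc ON street_addresses (house_number, postal_code);
    CREATE INDEX sa_hn_ci ON street_addresses (house_number, city);
    CREATE UNIQUE INDEX sa_unique_address ON street_addresses (
        street_name, house_number, entrance, postal_code, city
    );
""")

# street_names is listed first and joined with CROSS JOIN, which asks SQLite
# to keep it as the outer loop and visit street_addresses inside it.
sql = """
    EXPLAIN QUERY PLAN
    SELECT sn.name, sa.house_number, sa.entrance, sa.latitude, sa.longitude
    FROM street_names AS sn
    CROSS JOIN street_addresses AS sa
    WHERE sa.street_name = sn.id
      AND sn.name GLOB ?
      AND sa.house_number = ?
"""
details = [row[-1] for row in conn.execute(sql, ("FORNEBUVEIEN", 11))]

# The first plan row describes the outermost loop.
outer_is_street_names = re.search(r"\bsn\b", details[0]) is not None

print("\n".join(details))
print("street_names is the outer loop:", outer_is_street_names)
```

With street_names outside, the inner lookup on street_addresses has both street_name and house_number available, which are the two leading columns of sa_unique_address. Which index the planner then picks is still its own decision, so the printed plan is worth checking.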