我使用PostGIS 2.3.3运行PostgreSQL 9.6
我试图让这个相当关键的查询变得更快(寻找1000米范围内的用户),但我在设置正确的索引方面遇到了麻烦。
有人能指出我正确的方向吗?
users
为200k行,locations
为1200行,user_push_tokens
为155k行
用户表:
create table users
(
id serial not null
constraint users_pkey
primary key,
(an additional 20-ish columns),
geo_point geometry(Point,4326)
);`
create unique index users_id
on users (id);
create index users_geo_point_idx
on users using gist(geo_point);
user_push_tokens表:
create table user_push_tokens
(
user_id integer not null
constraint push_tokens_user_id_fkey
references users
on delete cascade,
push_token varchar(255) not null,
push_provider varchar(64) not null,
app varchar(64) default 'app'::character varying not null,
id integer default nextval('user_push_tokens_id_seq'::regclass) not null
constraint user_push_tokens_pkey
primary key,
active boolean default true not null,
);
create index trinity_unique_index
on user_push_tokens (user_id, app, push_token);
create index user_push_tokens_token_fetch_idx
on user_push_tokens (user_id, app, active);
create index user_id_index
on user_push_tokens (user_id);
位置表:
create table locations
(
id integer default nextval('locations_id_seq'::regclass) not null
constraint locations_pkey
primary key,
(another 25 columns),
geo_point geometry(Point,4326)
);
create unique index locations_id_key
on locations (id);
create index locations_geo_point_idx
on locations using gist(geo_point);
查询
EXPLAIN ANALYSE SELECT
u.id AS user_id,
upt.push_provider,
upt.push_token
FROM users u
JOIN locations l ON l.id = 3896
JOIN user_push_tokens upt
ON upt.user_id = u.id AND upt.active = true AND upt.app = 'app'
WHERE ST_DistanceSphere(u.geo_point, l.geo_point) <= 1000;
结果
Nested Loop (cost=26087.06..63658.87 rows=30605 width=107) (actual time=353.304..887.371 rows=2498 loops=1)
Join Filter: (_st_distance(geography(u.geo_point), geography(l.geo_point), '0'::double precision, false) <= '1000'::double precision)
Rows Removed by Join Filter: 89539
-> Index Scan using locations_id_key on locations l (cost=0.28..8.29 rows=1 width=32) (actual time=0.009..0.014 rows=1 loops=1)
Index Cond: (id = 3896)
-> Hash Join (cost=26086.78..39090.07 rows=91815 width=139) (actual time=352.437..657.228 rows=92037 loops=1)
Hash Cond: (upt.user_id = u.id)
-> Seq Scan on user_push_tokens upt (cost=0.00..7162.82 rows=91815 width=107) (actual time=0.032..103.512 rows=92037 loops=1)
Filter: (active AND ((app)::text = 'app'::text))
Rows Removed by Filter: 62437
-> Hash (cost=22114.46..22114.46 rows=195546 width=36) (actual time=352.199..352.199 rows=195589 loops=1)
Buckets: 65536 Batches: 4 Memory Usage: 3563kB
-> Seq Scan on users u (cost=0.00..22114.46 rows=195546 width=36) (actual time=0.014..214.976 rows=195589 loops=1)
感谢您的时间。
答案 0 :(得分:1)
像这样的索引会使它稍快一些:
CREATE INDEX ON user_push_tokens (app) WHERE active;
但是没有办法通过函数调用来加速嵌套循环。
要研究的一个想法是添加第二个“宽松”距离检查,该检查是可索引的,可能有误报,但过滤掉大部分行。
不确定这是否可行,但值得考虑。