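I have a suburbs table in which each suburb has a geom value representing its polygon on the map. There is another table, houses, in which each house has a geom value for its location on the map.
Both geom columns have GiST indexes, and the suburbs table also has an index on its name column. The suburbs table has 8k+ records, while the houses table has 300k+ records.
Now my task is to find all houses within a suburb named 'FOO'.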
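For readers who want to reproduce the setup, here is a minimal sketch of what the two tables and their indexes might look like. The index names houses_geom_gist and suburbs_suburb_name are taken from the query plans in this post; everything else (the id columns, the column types, the SRID 4326, and the name suburbs_geom_gist) is an assumption made for illustration, and the real houses table evidently has more columns than shown here.

```sql
-- Hypothetical schema: column types, SRID and id columns are assumptions.
CREATE EXTENSION IF NOT EXISTS postgis;

CREATE TABLE suburbs (
    id          serial PRIMARY KEY,
    suburb_name varchar(100),
    geom        geometry(Polygon, 4326)   -- one polygon per suburb (assumed type)
);

CREATE TABLE houses (
    id   serial PRIMARY KEY,
    geom geometry(Point, 4326)            -- one location per house (assumed type)
);

-- Spatial (GiST) indexes on both geometry columns.
CREATE INDEX suburbs_geom_gist ON suburbs USING gist (geom);  -- name assumed
CREATE INDEX houses_geom_gist  ON houses  USING gist (geom);  -- name from the plan

-- Ordinary B-tree index on the suburb name.
CREATE INDEX suburbs_suburb_name ON suburbs (suburb_name);    -- name from the plan
```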
QUERY#1:
SELECT * FROM houses WHERE ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);
Query plan result:
Seq Scan on houses (cost=8.29..86327.26 rows=102365 width=136)
Filter: st_intersects($0, geom)
InitPlan 1 (returns $0)
-> Index Scan using suburbs_suburb_name on suburbs (cost=0.28..8.29 rows=1 width=32)
Index Cond: ((suburb_name)::text = 'FOO'::text)
The query takes about 3.5 seconds to run and returns 486 records.
QUERY#2: (prefixing the ST_INTERSECTS function with _ to explicitly ask it not to use the index)
SELECT * FROM houses WHERE _ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);
Query plan result: (exactly the same as Query #1)
Seq Scan on houses (cost=8.29..86327.26 rows=102365 width=136)
Filter: st_intersects($0, geom)
InitPlan 1 (returns $0)
-> Index Scan using suburbs_suburb_name on suburbs (cost=0.28..8.29 rows=1 width=32)
Index Cond: ((suburb_name)::text = 'FOO'::text)
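The query takes about 1.7 seconds to run and returns 486 records.
QUERY#3: (using the && operator to add a bounding-box overlap check before the ST_Intersects function)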
SELECT * FROM houses WHERE (geom && (SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO')) AND ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);
Query plan result:
Bitmap Heap Scan on houses (cost=21.11..146.81 rows=10 width=136)
Recheck Cond: (geom && $0)
Filter: st_intersects($1, geom)
InitPlan 1 (returns $0)
-> Index Scan using suburbs_suburb_name on suburbs (cost=0.28..8.29 rows=1 width=32)
Index Cond: ((suburb_name)::text = 'FOO'::text)
InitPlan 2 (returns $1)
-> Index Scan using suburbs_suburb_name on suburbs suburbs_1 (cost=0.28..8.29 rows=1 width=32)
Index Cond: ((suburb_name)::text = 'FOO'::text)
-> Bitmap Index Scan on houses_geom_gist (cost=0.00..4.51 rows=31 width=0)
Index Cond: (geom && $0)
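The query takes 0.15 s to run and returns 486 records.
Clearly, only Query #3 benefits from the spatial index, which improves performance significantly. However, the syntax is ugly and somewhat repetitive. My question is: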
Answer 0: (score: 3)
Try flattening the query into a single query, without the unnecessary subquery:
SELECT houses.*
FROM houses, suburbs
WHERE suburbs.suburb_name = 'FOO' AND ST_Intersects(houses.geom, suburbs.geom);
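One way to check whether the flattened form actually uses the spatial index is to run it under EXPLAIN ANALYZE. The sketch below writes the same query with explicit JOIN syntax and the table aliases h and s, which are chosen here for illustration; it should be equivalent to the comma-join version above. A plausible explanation for the difference, offered as an assumption rather than something confirmed in this post, is that in older PostGIS versions ST_Intersects is an SQL wrapper around a && check plus _ST_Intersects, and PostgreSQL does not inline that wrapper when one argument is a subquery, so the indexable && condition never reaches the planner.

```sql
-- Same logic as the flattened query, written as an explicit join.
-- EXPLAIN ANALYZE runs the query and prints the plan with actual timings.
EXPLAIN ANALYZE
SELECT h.*
FROM suburbs AS s
JOIN houses  AS h
  ON ST_Intersects(h.geom, s.geom)   -- both arguments are plain columns
WHERE s.suburb_name = 'FOO';
```

If the index is being used, the plan should mention houses_geom_gist in an index scan or bitmap index scan node, as it does for Query #3, rather than a Seq Scan on houses.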