Question

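I have a suburbs table in which each suburb has a geom value representing its polygon on the map. There is also a houses table in which each house has a geom value giving its location on the map.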

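Both geom columns have GiST indexes, and the suburbs table also has an index on the name column. The suburbs table has 8k+ records and the houses table has 300k+ records.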

My task is to find all houses within the suburb named 'FOO'.

Query #1:

SELECT * FROM houses WHERE ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);

Query plan:

Seq Scan on houses  (cost=8.29..86327.26 rows=102365 width=136)
  Filter: st_intersects($0, geom)
  InitPlan 1 (returns $0)
    ->  Index Scan using suburbs_suburb_name on suburbs  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)

The query takes about 3.5 s to run and returns 486 records.

Query #2 (prefixing the ST_INTERSECTS function with _ to explicitly ask it not to use the index):

SELECT * FROM houses WHERE _ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);

Query plan (exactly the same as Query #1):

Seq Scan on houses  (cost=8.29..86327.26 rows=102365 width=136)
  Filter: st_intersects($0, geom)
  InitPlan 1 (returns $0)
    ->  Index Scan using suburbs_suburb_name on suburbs  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)

The query takes about 1.7 s to run and returns 486 records.

Query #3 (using the && operator to add a bounding-box overlap check before the ST_Intersects function):

SELECT * FROM houses WHERE (geom && (SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO')) AND ST_INTERSECTS((SELECT geom FROM "suburbs" WHERE "suburb_name" = 'FOO'), geom);

Query plan:

Bitmap Heap Scan on houses  (cost=21.11..146.81 rows=10 width=136)
  Recheck Cond: (geom && $0)
  Filter: st_intersects($1, geom)
  InitPlan 1 (returns $0)
    ->  Index Scan using suburbs_suburb_name on suburbs  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)
  InitPlan 2 (returns $1)
    ->  Index Scan using suburbs_suburb_name on suburbs suburbs_1  (cost=0.28..8.29 rows=1 width=32)
          Index Cond: ((suburb_name)::text = 'FOO'::text)
  ->  Bitmap Index Scan on houses_geom_gist  (cost=0.00..4.51 rows=31 width=0)
        Index Cond: (geom && $0)

The query takes 0.15 s to run and returns 486 records.

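Clearly, only Query #3 benefits from the spatial index, which improves performance significantly. However, the syntax is ugly and somewhat repetitive. My questions are: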

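Why is PostGIS not smart enough to use the spatial index in Query #1?
Why does Query #2 perform better than Query #1, given that neither of them uses the index?
Any suggestions for making Query #3 prettier? Or is there a better way to structure the query to do the same thing?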

Answer 1

Try flattening it into a single query, without the unnecessary subqueries:

SELECT houses.*
FROM houses, suburbs
WHERE suburbs.suburb_name = 'FOO' AND ST_Intersects(houses.geom, suburbs.geom);
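
The same flattened join can also be written with explicit JOIN ... ON syntax, which is equivalent to the comma form. The sketch below reuses the table and column names from the question; the aliases s and h are only illustrative. Running it under EXPLAIN ANALYZE is one way to check whether the plan now includes an index scan on houses_geom_gist. Whether it does may depend on the PostGIS and PostgreSQL versions and on the table statistics.

```sql
-- Sketch only: table, column, and index names are taken from the question;
-- the aliases s and h are illustrative.
EXPLAIN ANALYZE
SELECT h.*
FROM suburbs AS s
JOIN houses AS h
  ON ST_Intersects(s.geom, h.geom)  -- spatial predicate between plain columns
WHERE s.suburb_name = 'FOO';        -- selects the single suburb via its name index
```

One likely explanation for Query #1 (not given in the original answer): in PostGIS versions where ST_Intersects is an SQL wrapper around && plus _ST_Intersects, PostgreSQL does not inline the wrapper when one of its arguments is a subquery, so the && index condition never becomes visible to the planner. With both arguments as plain columns, as here, the wrapper can be inlined.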

PostGIS ST_Intersects query does not use the existing spatial index
