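Edited the question per @peter's comment, so both tables now use the same geometry type.
I'm running a query that intersects a table's geometries with a generic input geometry and then aggregates the results by a specific attribute.
The reason I'm suspicious of (and frustrated with) this particular query is that I can run the exact same query against a different table 40 times the size, and it takes 1/100 of the time.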
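I'm puzzled, because I'm very happy with how fast this query is on a large table of 1.3M records (~400 ms). However, on one particular table with 30,000 records, the query takes more than 40 seconds to complete.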
Here is the query:
WITH input_geom AS (
    SELECT ST_Transform(
        ST_SetSRID(
            ST_GeomFromGeoJSON(
                '{"type":"Polygon","coordinates":[[[-91.865616,47.803339],[-91.830597,47.780274],[-91.810341,47.817404],[-91.865616,47.803339]]]}'
            ), 4326
        ), 26915
    ) AS geom
)
-- find total area and proportion for each type
SELECT
    attr,
    total_area_summtable AS area,
    total_area_summtable / buff_area.area_sqm AS percent
FROM
    (-- group by attribute and buffer
        SELECT
            attr,
            sum(area_sqm) AS total_area_summtable
        FROM
            (-- find intersected area of each type
             -- Clip ownership by input geom
                SELECT
                    %attr% AS attr,
                    CASE
                        -- speed intersection calculation by using summary table
                        -- geom when it covers the entire buffer
                        -- otherwise use intersection of geometries
                        WHEN ST_CoveredBy(input_geom.geom, summtable.geom) THEN ST_Area(input_geom.geom)
                        ELSE ST_Area(ST_Multi(ST_Intersection(input_geom.geom, summtable.geom)))
                    END AS area_sqm
                FROM input_geom
                INNER JOIN %table% AS summtable ON ST_Intersects(input_geom.geom, summtable.geom)
            ) AS summtable_inter
        -- group by type
        GROUP BY attr
    ) AS summtable_area,
    (-- find total area for the buffer
        SELECT
            ST_Area(ST_Collect(geom)) AS area_sqm
        FROM input_geom
    ) AS buff_area
It produces results like this:
attr  area              percent
6     17106063.3199902  0.0630578194718625
8     41892903.9272884  0.154429170732226
2     4441738.70688669  0.016373513430921
....
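For concreteness, here is a sketch of just the innermost subquery with the placeholders filled in for the slow table. It assumes `%attr%` is `owner_desc` and `%table%` is `public.gap_stewardship_2008_all_ownership_types`, which is what the plan output further down suggests. It returns one row per intersecting feature, before any grouping.

```sql
-- Sketch only: %attr% -> owner_desc and %table% -> the slow table are
-- assumptions inferred from the EXPLAIN output, not stated in the template.
WITH input_geom AS (
    SELECT ST_Transform(
        ST_SetSRID(
            ST_GeomFromGeoJSON(
                '{"type":"Polygon","coordinates":[[[-91.865616,47.803339],[-91.830597,47.780274],[-91.810341,47.817404],[-91.865616,47.803339]]]}'
            ), 4326
        ), 26915
    ) AS geom
)
SELECT
    summtable.owner_desc AS attr,
    CASE
        -- input fully inside one feature: its own area is the answer
        WHEN ST_CoveredBy(input_geom.geom, summtable.geom) THEN ST_Area(input_geom.geom)
        -- otherwise clip the feature to the input geometry
        ELSE ST_Area(ST_Multi(ST_Intersection(input_geom.geom, summtable.geom)))
    END AS area_sqm
FROM input_geom
INNER JOIN public.gap_stewardship_2008_all_ownership_types AS summtable
    ON ST_Intersects(input_geom.geom, summtable.geom);
```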
Here is the EXPLAIN ANALYZE output for this query:
Nested Loop (cost=31.00..31.34 rows=9 width=23) (actual time=49042.306..49042.309 rows=5 loops=1)
Output: mown.owner_desc, (sum(CASE WHEN ((input_geom_1.geom @ mown.geom) AND _st_coveredby(input_geom_1.geom, mown.geom)) THEN st_area(input_geom_1.geom) ELSE st_area(st_multi(st_intersection(input_geom_1.geom, mown.geom))) END)), ((sum(CASE WHEN ((input (...)
CTE input_geom
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
Output: '01030000202369000001000000040000003D484506D9D92141B7A4EA61F6325441A3BEFDC8A2EE2141E731F0497F305441774E0C95FEF9214173B409BA8C3454413D484506D9D92141B7A4EA61F6325441'::geometry
-> Aggregate (cost=0.02..0.06 rows=1 width=32) (actual time=0.035..0.036 rows=1 loops=1)
Output: st_area(st_collect(input_geom.geom))
-> CTE Scan on input_geom (cost=0.00..0.02 rows=1 width=32) (actual time=0.005..0.006 rows=1 loops=1)
Output: input_geom.geom
-> HashAggregate (cost=30.96..31.05 rows=9 width=18085) (actual time=49042.264..49042.266 rows=5 loops=1)
Output: mown.owner_desc, sum(CASE WHEN ((input_geom_1.geom @ mown.geom) AND _st_coveredby(input_geom_1.geom, mown.geom)) THEN st_area(input_geom_1.geom) ELSE st_area(st_multi(st_intersection(input_geom_1.geom, mown.geom))) END)
Group Key: mown.owner_desc
-> Nested Loop (cost=4.32..25.34 rows=18 width=18085) (actual time=3.304..791.829 rows=39 loops=1)
Output: input_geom_1.geom, mown.owner_desc, mown.geom
-> CTE Scan on input_geom input_geom_1 (cost=0.00..0.02 rows=1 width=32) (actual time=0.001..0.003 rows=1 loops=1)
Output: input_geom_1.geom
-> Bitmap Heap Scan on public.gap_stewardship_2008_all_ownership_types mown (cost=4.32..25.30 rows=2 width=18053) (actual time=3.299..791.762 rows=39 loops=1)
Output: mown.gid, mown.wetland_ty, mown.county, mown.name, mown.unit, mown.owner, mown.owner_ver1, mown.owner_desc, mown.owner_name, mown.agency, mown.agncy_ver1, mown.agency_nam, mown.new_manage, mown.name_manag, mown.comments, mown.or (...)
Recheck Cond: (input_geom_1.geom && mown.geom)
Filter: _st_intersects(input_geom_1.geom, mown.geom)
Rows Removed by Filter: 208
Heap Blocks: exact=142
-> Bitmap Index Scan on gap_stewardship_2008_all_ownership_types_geom_idx (cost=0.00..4.31 rows=5 width=0) (actual time=0.651..0.651 rows=247 loops=1)
Index Cond: (input_geom_1.geom && mown.geom)
Planning time: 1.245 ms
Execution time: 49046.184 ms
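Reading the plan: the inner Nested Loop finishes at about 792 ms with 39 rows, while the HashAggregate above it finishes at about 49,042 ms. So most of the time seems to go into evaluating the CASE expression (ST_CoveredBy / ST_Intersection) on those 39 rows rather than into the index scan. The estimated row width of 18,085 bytes also hints that the geometries may be large. A check that might confirm or rule this out, assuming vertex count is the relevant factor, is to list the size of each matched geometry:

```sql
-- Diagnostic sketch: how big is each slow-table geometry that intersects the input?
-- Assumption: vertex count is what makes ST_CoveredBy / ST_Intersection expensive here.
WITH input_geom AS (
    SELECT ST_Transform(
        ST_SetSRID(
            ST_GeomFromGeoJSON(
                '{"type":"Polygon","coordinates":[[[-91.865616,47.803339],[-91.830597,47.780274],[-91.810341,47.817404],[-91.865616,47.803339]]]}'
            ), 4326
        ), 26915
    ) AS geom
)
SELECT
    mown.gid,
    mown.owner_desc,
    ST_NPoints(mown.geom)       AS n_points,     -- total vertices in the feature
    ST_NumGeometries(mown.geom) AS n_parts,      -- polygons inside the MultiPolygon
    pg_column_size(mown.geom)   AS stored_bytes  -- stored size, possibly compressed
FROM input_geom
INNER JOIN public.gap_stewardship_2008_all_ownership_types AS mown
    ON ST_Intersects(input_geom.geom, mown.geom)
ORDER BY n_points DESC;
```

Running the same query with `public.nwi_combine` in place of the slow table would give a like-for-like comparison.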
Here is the SQL to recreate the relevant tables:
Slow table (30,000 rows):
CREATE TABLE public.gap_stewardship_2008_all_ownership_types
(
gid integer NOT NULL DEFAULT nextval('gap_stewardship_2008_all_ownership_types_gid_seq'::regclass),
wetland_ty character varying(50),
county character varying(50),
name character varying(50),
unit character varying(50),
owner smallint,
owner_ver1 smallint,
owner_desc character varying(50),
owner_name character varying(50),
agency smallint,
agncy_ver1 smallint,
agency_nam character varying(50),
new_manage smallint,
name_manag character varying(50),
comments character varying(100),
origin character varying(50),
area numeric,
acres numeric,
perfeet numeric,
perimeter numeric,
km2 numeric,
shape_leng numeric,
shape_area numeric,
geom geometry(MultiPolygon,26915),
CONSTRAINT gap_stewardship_2008_all_ownership_types_pkey PRIMARY KEY (gid)
)
Fast table (1,300,000 rows):
CREATE TABLE public.nwi_combine
(
id integer NOT NULL DEFAULT nextval('"NWI_combine_AOI_id_seq"'::regclass),
geom geometry(MultiPolygon,26915),
attribute character varying(254),
wetland_ty character varying(254),
acres numeric,
hgm_code character varying(254),
hgm_desc character varying(254),
spcc_desc character varying(254),
cow_class1 character varying(254),
circ39_cla bigint,
hgm_ll_des character varying(254),
shape_leng numeric,
shape_area numeric,
nwi_code character varying(254),
new_cow character varying(254),
system character varying(254),
subsystem character varying(254),
class1 character varying(254),
subclass1 character varying(254),
class2 character varying(254),
subclass2 character varying(254),
wreg character varying(254),
soilm character varying(254),
spec_mod1 character varying(254),
spec_mod2 character varying(254),
circ39 character varying(254),
old_cow character varying(254),
mnwet character varying(254),
circ39_com bigint,
CONSTRAINT "NWI_combine_AOI_pkey" PRIMARY KEY (id)
)
Each of these tables has a GiST index on its geometry column.
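The index definitions are not part of the CREATE TABLE statements above. A quick way to list them is the standard `pg_indexes` catalog view; the slow table's index appears in the plan as `gap_stewardship_2008_all_ownership_types_geom_idx`, and the name of the index on `nwi_combine` is not shown anywhere in this post.

```sql
-- List every index on the two tables, with its full definition,
-- to confirm both geometry columns are indexed with USING gist.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename IN ('gap_stewardship_2008_all_ownership_types', 'nwi_combine')
ORDER BY tablename, indexname;
```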
Does anyone know what might be causing this difference?