PostGIS intersection/summary query is very slow for one particular table

Date: 2017-11-15 22:14:40

Tags: postgresql postgis

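Edited the question based on @peter's comment, so both tables now use the same geometry type.

The query I'm running intersects a table's geometries with a generic input geometry, then summarizes the results by a particular attribute.

The reason I'm suspicious of (and frustrated by) this particular query is that I can run the exact same query on a different table that is 40 times larger, and it takes 1/100 of the time.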

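What puzzles me is that I'm very happy with this query's speed (~400 ms) on the large table with 1.3M records. On one particular table with only 30,000 records, however, the query takes more than 40 seconds to complete.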

Here is the query:

WITH input_geom AS (
    SELECT ST_Transform(
        ST_SetSRID(
            ST_GeomFromGeoJSON(
                '{"type":"Polygon","coordinates":[[[-91.865616,47.803339],[-91.830597,47.780274],[-91.810341,47.817404],[-91.865616,47.803339]]]}'
            ), 4326
        ), 26915
    ) AS geom
)


-- find total area and proportion for each type
SELECT
    attr,
    total_area_summtable AS area,
    total_area_summtable / buff_area.area_sqm AS percent
FROM
(-- group by attribute and buffer
    SELECT
        attr,
        sum(area_sqm) AS total_area_summtable
    FROM
    (-- find intersected area of each type
        -- Clip ownership by input geom
        SELECT
            %attr% AS attr,
            CASE
                -- speed intersection calculation by using summary table
                -- geom when it covers the entire buffer
                -- otherwise use intersection of geometries
                WHEN ST_CoveredBy(input_geom.geom, summtable.geom) THEN ST_Area(input_geom.geom)
                ELSE ST_Area(ST_Multi(ST_Intersection(input_geom.geom,summtable.geom)))
            END AS area_sqm
        FROM input_geom
        INNER JOIN %table% AS summtable ON ST_Intersects(input_geom.geom, summtable.geom)

    ) AS summtable_inter
    -- group by type
    GROUP BY attr
) AS summtable_area,
(-- find total area for the buffer
    SELECT
        ST_Area(ST_Collect(geom)) AS area_sqm
    FROM input_geom
) AS buff_area

This produces results like the following:

attr  area              percent  
6     17106063.3199902  0.0630578194718625  
8     41892903.9272884  0.154429170732226  
2     4441738.70688669  0.016373513430921  
....

Here is the EXPLAIN ANALYZE output for this query:

Nested Loop  (cost=31.00..31.34 rows=9 width=23) (actual time=49042.306..49042.309 rows=5 loops=1)
  Output: mown.owner_desc, (sum(CASE WHEN ((input_geom_1.geom @ mown.geom) AND _st_coveredby(input_geom_1.geom, mown.geom)) THEN st_area(input_geom_1.geom) ELSE st_area(st_multi(st_intersection(input_geom_1.geom, mown.geom))) END)), ((sum(CASE WHEN ((input (...)
  CTE input_geom
    ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
          Output: '01030000202369000001000000040000003D484506D9D92141B7A4EA61F6325441A3BEFDC8A2EE2141E731F0497F305441774E0C95FEF9214173B409BA8C3454413D484506D9D92141B7A4EA61F6325441'::geometry
  ->  Aggregate  (cost=0.02..0.06 rows=1 width=32) (actual time=0.035..0.036 rows=1 loops=1)
        Output: st_area(st_collect(input_geom.geom))
        ->  CTE Scan on input_geom  (cost=0.00..0.02 rows=1 width=32) (actual time=0.005..0.006 rows=1 loops=1)
              Output: input_geom.geom
  ->  HashAggregate  (cost=30.96..31.05 rows=9 width=18085) (actual time=49042.264..49042.266 rows=5 loops=1)
        Output: mown.owner_desc, sum(CASE WHEN ((input_geom_1.geom @ mown.geom) AND _st_coveredby(input_geom_1.geom, mown.geom)) THEN st_area(input_geom_1.geom) ELSE st_area(st_multi(st_intersection(input_geom_1.geom, mown.geom))) END)
        Group Key: mown.owner_desc
        ->  Nested Loop  (cost=4.32..25.34 rows=18 width=18085) (actual time=3.304..791.829 rows=39 loops=1)
              Output: input_geom_1.geom, mown.owner_desc, mown.geom
              ->  CTE Scan on input_geom input_geom_1  (cost=0.00..0.02 rows=1 width=32) (actual time=0.001..0.003 rows=1 loops=1)
                    Output: input_geom_1.geom
              ->  Bitmap Heap Scan on public.gap_stewardship_2008_all_ownership_types mown  (cost=4.32..25.30 rows=2 width=18053) (actual time=3.299..791.762 rows=39 loops=1)
                    Output: mown.gid, mown.wetland_ty, mown.county, mown.name, mown.unit, mown.owner, mown.owner_ver1, mown.owner_desc, mown.owner_name, mown.agency, mown.agncy_ver1, mown.agency_nam, mown.new_manage, mown.name_manag, mown.comments, mown.or (...)
                    Recheck Cond: (input_geom_1.geom && mown.geom)
                    Filter: _st_intersects(input_geom_1.geom, mown.geom)
                    Rows Removed by Filter: 208
                    Heap Blocks: exact=142
                    ->  Bitmap Index Scan on gap_stewardship_2008_all_ownership_types_geom_idx  (cost=0.00..4.31 rows=5 width=0) (actual time=0.651..0.651 rows=247 loops=1)
                          Index Cond: (input_geom_1.geom && mown.geom)
Planning time: 1.245 ms
Execution time: 49046.184 ms
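
In the plan above, the placeholders appear as the table public.gap_stewardship_2008_all_ownership_types and the column owner_desc. The index-assisted join finishes in about 0.8 s, while the HashAggregate that evaluates the CASE expression accounts for nearly all of the ~49 s. As a rough sketch, assuming those two substitutions for %table% and %attr%, the innermost step of the template (before any grouping) would read as follows. Running it on its own under EXPLAIN ANALYZE is one way to look at the per-row intersection work separately from the aggregation.

```sql
-- Assumed substitutions, inferred from the plan above:
--   %table% -> public.gap_stewardship_2008_all_ownership_types
--   %attr%  -> owner_desc
WITH input_geom AS (
    SELECT ST_Transform(
        ST_SetSRID(
            ST_GeomFromGeoJSON(
                '{"type":"Polygon","coordinates":[[[-91.865616,47.803339],[-91.830597,47.780274],[-91.810341,47.817404],[-91.865616,47.803339]]]}'
            ), 4326
        ), 26915
    ) AS geom
)
-- One row per intersecting polygon, with the clipped area in square meters
SELECT
    summtable.owner_desc AS attr,
    CASE
        -- input polygon lies entirely inside this row's geometry
        WHEN ST_CoveredBy(input_geom.geom, summtable.geom) THEN ST_Area(input_geom.geom)
        -- otherwise compute the actual intersection
        ELSE ST_Area(ST_Multi(ST_Intersection(input_geom.geom, summtable.geom)))
    END AS area_sqm
FROM input_geom
INNER JOIN public.gap_stewardship_2008_all_ownership_types AS summtable
    ON ST_Intersects(input_geom.geom, summtable.geom);
```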

Here is the SQL to recreate the relevant tables:
Slow table (30,000 rows):

CREATE TABLE public.gap_stewardship_2008_all_ownership_types
(
  gid integer NOT NULL DEFAULT nextval('gap_stewardship_2008_all_ownership_types_gid_seq'::regclass),
  wetland_ty character varying(50),
  county character varying(50),
  name character varying(50),
  unit character varying(50),
  owner smallint,
  owner_ver1 smallint,
  owner_desc character varying(50),
  owner_name character varying(50),
  agency smallint,
  agncy_ver1 smallint,
  agency_nam character varying(50),
  new_manage smallint,
  name_manag character varying(50),
  comments character varying(100),
  origin character varying(50),
  area numeric,
  acres numeric,
  perfeet numeric,
  perimeter numeric,
  km2 numeric,
  shape_leng numeric,
  shape_area numeric,
  geom geometry(MultiPolygon,26915),
  CONSTRAINT gap_stewardship_2008_all_ownership_types_pkey PRIMARY KEY (gid)
)

Fast table (1,300,000 rows):

CREATE TABLE public.nwi_combine
(
  id integer NOT NULL DEFAULT nextval('"NWI_combine_AOI_id_seq"'::regclass),
  geom geometry(MultiPolygon,26915),
  attribute character varying(254),
  wetland_ty character varying(254),
  acres numeric,
  hgm_code character varying(254),
  hgm_desc character varying(254),
  spcc_desc character varying(254),
  cow_class1 character varying(254),
  circ39_cla bigint,
  hgm_ll_des character varying(254),
  shape_leng numeric,
  shape_area numeric,
  nwi_code character varying(254),
  new_cow character varying(254),
  system character varying(254),
  subsystem character varying(254),
  class1 character varying(254),
  subclass1 character varying(254),
  class2 character varying(254),
  subclass2 character varying(254),
  wreg character varying(254),
  soilm character varying(254),
  spec_mod1 character varying(254),
  spec_mod2 character varying(254),
  circ39 character varying(254),
  old_cow character varying(254),
  mnwet character varying(254),
  circ39_com bigint,
  CONSTRAINT "NWI_combine_AOI_pkey" PRIMARY KEY (id)
)

Each of these tables has a GiST index on its geometry column.
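
For completeness, the index definitions would look roughly like the sketch below. The first index name is the one that appears in the EXPLAIN output; the name nwi_combine_geom_idx is only a hypothetical placeholder, since the actual name of the index on the fast table is not shown here.

```sql
-- GiST index on the slow table (name taken from the EXPLAIN output)
CREATE INDEX gap_stewardship_2008_all_ownership_types_geom_idx
    ON public.gap_stewardship_2008_all_ownership_types
    USING GIST (geom);

-- GiST index on the fast table (index name is hypothetical)
CREATE INDEX nwi_combine_geom_idx
    ON public.nwi_combine
    USING GIST (geom);

-- Refresh planner statistics once the indexes exist
ANALYZE public.gap_stewardship_2008_all_ownership_types;
ANALYZE public.nwi_combine;
```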

Does anyone know what might be causing this difference?

0 answers:

No answers