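The following query takes almost 15 minutes to return results. I would like to know why. Is it the data, or the number of vertices in the geometries? When I tried the same query against a different table (loaded from a small shapefile), it ran fast.
Here is the query (thanks to Patrick):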
WITH hi AS (
SELECT ps.id, ps.brgy_locat, ps.municipali
FROM evidensapp_polystructures ps
JOIN evidensapp_seniangcbr fh ON fh.hazard = 'High'
AND ST_Intersects(fh.geom, ps.geom)
), med AS (
SELECT ps.id, ps.brgy_locat, ps.municipali
FROM evidensapp_polystructures ps
JOIN evidensapp_seniangcbr fh ON fh.hazard = 'Medium'
AND ST_Intersects(fh.geom, ps.geom)
EXCEPT SELECT * FROM hi
), low AS (
SELECT ps.id, ps.brgy_locat, ps.municipali
FROM evidensapp_polystructures ps
JOIN evidensapp_seniangcbr fh ON fh.hazard = 'Low'
AND ST_Intersects(fh.geom, ps.geom)
EXCEPT SELECT * FROM hi
EXCEPT SELECT * FROM med
)
SELECT brgy_locat AS barangay, municipali AS municipality, high, medium, low
FROM (SELECT brgy_locat, municipali, count(*) AS high
FROM hi
GROUP BY 1, 2) cnt_hi
FULL JOIN (SELECT brgy_locat, municipali, count(*) AS medium
FROM med
GROUP BY 1, 2) cnt_med USING (brgy_locat, municipali)
FULL JOIN (SELECT brgy_locat, municipali, count(*) AS low
FROM low
GROUP BY 1, 2) cnt_low USING (brgy_locat, municipali);
PostgreSQL 9.3, PostGIS 2.1.5
Table Polystructures: contains 9847 rows:
CREATE TABLE evidensapp_polystructures (
id serial NOT NULL PRIMARY KEY,
bldg_name character varying(100) NOT NULL,
bldg_type character varying(50) NOT NULL,
brgy_locat character varying(50) NOT NULL,
municipali character varying(50) NOT NULL,
province character varying(50) NOT NULL,
geom geometry(MultiPolygon,32651)
);
CREATE INDEX evidensapp_polystructures_geom_id
ON evidensapp_polystructures USING gist (geom);
ALTER TABLE evidensapp_polystructures CLUSTER ON evidensapp_polystructures_geom_id;
Table SeniangCBR: only 6 rows; shapefile size (if it matters): 52,060 KB
CREATE TABLE evidensapp_seniangcbr (
id serial NOT NULL PRIMARY KEY,
hazard character varying(16) NOT NULL,
geom geometry(MultiPolygon,32651)
);
CREATE INDEX evidensapp_seniangcbr_geom_id ON evidensapp_seniangcbr USING gist (geom);
ALTER TABLE evidensapp_seniangcbr CLUSTER ON evidensapp_seniangcbr_geom_id;
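All the data was loaded into the database automatically with the Django (GeoDjango) LayerMapping utility.
I don't have a server at the moment; I am running the query on my PC.
Answer 0 (score: 2)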
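The EXPLAIN ANALYZE output is hard to read because all field and function names have been obfuscated into radio-alphabet words. That said, two points stand out:
1. Most of the time is spent in the ST_Intersects() function, which is not surprising.
2. The EXCEPT clauses appear to be quite inefficient as well.
So try this instead, a much less verbose version: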
SELECT brgy_locat AS barangay, municipali AS municipality,
sum(CASE max_hz_id WHEN 3 THEN 1 ELSE 0 END) AS high,
sum(CASE max_hz_id WHEN 2 THEN 1 ELSE 0 END) AS medium,
sum(CASE max_hz_id WHEN 1 THEN 1 ELSE 0 END) AS low
FROM (
SELECT ps.id, ps.brgy_locat, ps.municipali,
max(CASE fh.hazard WHEN 'Low' THEN 1 WHEN 'Medium' THEN 2 WHEN 'High' THEN 3 END) AS max_hz_id
FROM evidensapp_polystructures ps
JOIN evidensapp_seniangcbr fh ON ST_Intersects(fh.geom, ps.geom)
GROUP BY 1, 2, 3
) AS ps_fh
GROUP BY 1, 2;
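Now there is only a single call to ST_Intersects(), which is probably (hopefully) much faster than three calls on subsets of the hazard map (due to the internal efficiencies of the PostGIS code).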
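As you can see, the hazard class strings are converted to a sequence of integers, which makes ordering and comparing easy. In the inner query, the maximum hazard value is selected for each structure, as you require. In the main query, the per-structure maximum values are tallied into their respective columns. If at all possible, change your table structure to use these three integer codes and link to a helper table holding the class labels: your table becomes smaller and therefore faster, and the CASE statement in the inner query can be dropped. Alternatively, add a column with the integer code and update its values based on the "hazard" column.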
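The second option (an extra integer column) could look roughly like the sketch below. The column name hazard_id is an assumption made for illustration; it does not exist in the tables as posted.

```sql
-- Hypothetical column name: hazard_id (not part of the original schema).
ALTER TABLE evidensapp_seniangcbr ADD COLUMN hazard_id smallint;

-- Map the three hazard labels onto ordered integer codes.
UPDATE evidensapp_seniangcbr
SET    hazard_id = CASE hazard
                      WHEN 'Low'    THEN 1
                      WHEN 'Medium' THEN 2
                      WHEN 'High'   THEN 3
                   END;

-- The inner query can then aggregate the code directly:
SELECT ps.id, ps.brgy_locat, ps.municipali,
       max(fh.hazard_id) AS max_hz_id
FROM   evidensapp_polystructures ps
JOIN   evidensapp_seniangcbr fh ON ST_Intersects(fh.geom, ps.geom)
GROUP  BY 1, 2, 3;
```

With only 6 rows in evidensapp_seniangcbr the saving from dropping the CASE is probably small; the main gain is a simpler query.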
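Note that these CASE statements are not very efficient (which is why I used the EXCEPT clauses in my previous answer). PG 9.4 introduces a new FILTER clause for aggregate functions, which would make your query both faster and easier to read: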
count(id) FILTER (WHERE max_hz_id = 3) AS high
You may want to consider upgrading.
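For reference, the complete query with FILTER in place of the three outer CASE expressions might look like this. It is a sketch that only runs on PostgreSQL 9.4 or later, not on the 9.3 installation described in the question:

```sql
-- Requires PostgreSQL 9.4+ (aggregate FILTER clause).
SELECT brgy_locat AS barangay, municipali AS municipality,
       count(id) FILTER (WHERE max_hz_id = 3) AS high,
       count(id) FILTER (WHERE max_hz_id = 2) AS medium,
       count(id) FILTER (WHERE max_hz_id = 1) AS low
FROM  (
   -- One row per structure, carrying the highest hazard class it intersects.
   SELECT ps.id, ps.brgy_locat, ps.municipali,
          max(CASE fh.hazard WHEN 'Low' THEN 1 WHEN 'Medium' THEN 2 WHEN 'High' THEN 3 END) AS max_hz_id
   FROM   evidensapp_polystructures ps
   JOIN   evidensapp_seniangcbr fh ON ST_Intersects(fh.geom, ps.geom)
   GROUP  BY 1, 2, 3
   ) AS ps_fh
GROUP  BY 1, 2;
```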
Selamat mula Maynila
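Answer 1 (score: 1)
Add a bounding_box geometry(Polygon,4326) column to your tables. The value of that column would be a bounding box (max x,y and min x,y) that completely encloses the multipolygon.
Your query would then look something like this:
AND ST_Intersects(fh.bounding_box, ps.bounding_box)
AND ST_Intersects(fh.geom, ps.geom)
The benefit is that the first ST_Intersects call is very fast. If it returns false, the second, more complex ST_Intersects call is never made, which saves some time in that case.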
Answer 2 (score: 1)
Similar to what was suggested and explained under your related question, I would use UNION ALL instead of FULL JOIN in the outer SELECT.
WITH hi AS (
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM evidensapp_seniangcbr fh
JOIN evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
WHERE fh.hazard = 'High'
GROUP BY 1, 2, 3
)
, med AS (
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM evidensapp_seniangcbr fh
JOIN evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
LEFT JOIN hi USING (brgy_locat, municipali)
WHERE fh.hazard = 'Medium'
AND hi.brgy_locat IS NULL
GROUP BY 1, 2, 3
)
TABLE hi
UNION ALL
TABLE med
UNION ALL
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM evidensapp_seniangcbr fh
JOIN evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
LEFT JOIN hi USING (brgy_locat, municipali)
LEFT JOIN med USING (brgy_locat, municipali)
WHERE fh.hazard = 'Low'
AND hi.brgy_locat IS NULL
AND med.brgy_locat IS NULL
GROUP BY 1, 2, 3;
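This only considers the highest hazard level for each group of rows with identical (brgy_locat, municipali). Only rows that actually intersect with any row of the relevant hazard level in evidensapp_seniangcbr appear in the result. Likewise, the count only includes rows that actually intersect. There may be more rows in evidensapp_polystructures with the same (brgy_locat, municipali) that simply do not intersect with the same hazard level and are therefore ignored.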
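Pick one of the standard techniques to exclude, at the lower levels, rows for which a match was already found at a higher hazard level.
LEFT JOIN / IS NULL should be able to use the index on id and perform very well here. It is certainly faster than EXCEPT, which compares whole rows and cannot use an index.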
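You do not need to add a bounding_box geometry column to your tables as the other answer suggests. Modern versions of PostGIS use an (index-backed) bounding box comparison automatically. The PostGIS documentation:
This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries.
In fact, we already see index scans in the explain output you posted.
Your existing GiST index evidensapp_polystructures_geom_id should serve the query well.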
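One way to check this on your own data is to compare the plan for the exact test with the plan for the bounding-box test alone. This is a minimal sketch using the table names from the question; the plan text depends on your data and versions, so no expected output is shown.

```sql
-- Exact test: ST_Intersects() already includes an index-assisted
-- bounding-box filter, so the GiST index can be used without a helper column.
EXPLAIN ANALYZE
SELECT count(*)
FROM   evidensapp_seniangcbr fh
JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
WHERE  fh.hazard = 'High';

-- Bounding-box test only (the && operator): a coarse pre-filter,
-- shown here for comparison of the timings.
EXPLAIN ANALYZE
SELECT count(*)
FROM   evidensapp_seniangcbr fh
JOIN   evidensapp_polystructures ps ON fh.geom && ps.geom
WHERE  fh.hazard = 'High';
```

If the first plan shows an index scan on evidensapp_polystructures_geom_id but is still slow while the second is fast, the time is likely going into the exact intersection test against the very large hazard polygons rather than into the bounding-box filter.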
Aside: the index would be better named evidensapp_polystructures_geom_idx.
Also, if you don't have one yet, create an index on (brgy_locat, municipali):
CREATE INDEX foo_idx ON evidensapp_polystructures (brgy_locat, municipali);
LATERAL join
Since evidensapp_seniangcbr has only 6 rows, a LATERAL join might be faster:
WITH hi AS (
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM evidensapp_seniangcbr fh
, LATERAL (
SELECT ps.brgy_locat, ps.municipali
FROM evidensapp_polystructures ps
WHERE ST_Intersects(fh.geom, ps.geom)
) ps
WHERE fh.hazard = 'High'
GROUP BY 1, 2, 3
)
, med AS (
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM evidensapp_seniangcbr fh
, LATERAL (
SELECT ps.brgy_locat, ps.municipali
FROM evidensapp_polystructures ps
LEFT JOIN hi USING (brgy_locat, municipali)
WHERE hi.brgy_locat IS NULL
AND ST_Intersects(fh.geom, ps.geom)
) ps
WHERE fh.hazard = 'Medium'
GROUP BY 1, 2, 3
)
TABLE hi
UNION ALL
TABLE med
UNION ALL
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM evidensapp_seniangcbr fh
, LATERAL (
SELECT ps.id, ps.brgy_locat, ps.municipali
FROM evidensapp_polystructures ps
LEFT JOIN hi USING (brgy_locat, municipali)
LEFT JOIN med USING (brgy_locat, municipali)
WHERE hi.brgy_locat IS NULL
AND med.brgy_locat IS NULL
AND ST_Intersects(fh.geom, ps.geom)
) ps
WHERE fh.hazard = 'Low'
GROUP BY 1, 2, 3;
About LATERAL joins: