Optimizing a PostGIS query and understanding the effect

Date: 2017-11-12 21:24:22

Tags: sql postgresql performance geocoding postgis

I have a database on a DigitalOcean server that seems a bit slow to me (queries sometimes take more than a second). It runs PostgreSQL with PostGIS.

Here are some statistics about the houses table, which actually stores only apartments:

Houses: 190000

SELECT count(*) from houses;

Houses that were online in the last 24 hours: 58000

SELECT count(*) FROM houses 
JOIN (select max(last_seen) as last_ts from houses) as dt 
ON last_seen >= dt.last_ts - interval '24 hour';

Houses that are in a specific area and active: 3086

 select count(*) from houses 
 where ST_DWithin(geom, ST_MakePoint(52.5277411, 13.4)::geography,30000)
   AND (active IS NULL OR active = TRUE);

This is the actual SQL query, which is a bit slow. Slow means that a single query sometimes takes more than a second:

SELECT
      *,
      ST_DistanceSphere(geom, ST_MakePoint(52.5277411, 13.4)) as distance
      FROM houses 
      JOIN (select max(last_seen) as last_ts from houses) as dt 
      ON last_seen >= dt.last_ts - interval '24 hour'
      WHERE  
        ST_DWithin(geom, ST_MakePoint(52.5277411, 13.4)::geography,30000)
        AND (active IS NULL OR active = TRUE);

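What I have tried so far: removing the join, since it is somewhat redundant, and adding indexes.

Here is the EXPLAIN output for the query:

[Image: EXPLAIN output]

Any ideas on how to improve this? Thanks a lot!

PS: If some data is missing, let me know and I will provide it.

Here is the same query with EXPLAIN ANALYZE: [Image: EXPLAIN ANALYZE output]

Database indexes: [Image: index definitions]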

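1 Answer:

Answer 0 (score: 1)

Since many people tried to help and gave very good suggestions, I want to post my final solution. As mentioned in the comments, you should always measure, optimize, repeat. Table size and indexes are the key points.

Since I am not an expert on this topic, the visualization tool at http://tatiyants.com helped a lot: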
 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
 select
  *,
  ST_DistanceSphere(geom, ST_MakePoint(52.5277411, 13.4)) as distance
  FROM houses
  JOIN (select max(last_seen) as last_ts from houses) as dt
  ON last_seen >= dt.last_ts - interval '24 hour'
  WHERE
    ST_DWithin(geom, ST_MakePoint(52.5277411, 13.4)::geography,30000)
    AND (active IS NULL OR active = TRUE);

[Image: query plan visualisation]

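That helped with a basic understanding. Since I was already using indexes, there were not many optimizations left. In my case it is acceptable for results to be slightly out of date, so I introduced a materialized view that stores part of the query: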

CREATE MATERIALIZED VIEW mathouses AS
 select
  *
  FROM houses
  JOIN (select max(last_seen) as last_ts from houses) as dt
  ON last_seen >= dt.last_ts - interval '24 hour'
  WHERE (active IS NULL OR active = TRUE);

Then I added indexes on that view, along with a simple shell script that cron calls every hour:

#!/bin/sh
sudo -u <myuser> -Hi -- psql -d <db> -c 'refresh materialized view mathouses;'
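
The answer does not show the indexes that were added on the view. The following is a minimal sketch of what they might look like, assuming geom is a geometry column and houses has a unique id column (both are assumptions, as are the index names). The unique index is included only because REFRESH MATERIALIZED VIEW CONCURRENTLY requires one; the concurrent variant is a possible alternative to the plain refresh in the script above, not what the answer used:

```sql
-- Spatial index on the geography cast that ST_DWithin(..., ::geography, ...) uses.
-- Assumption: geom is a geometry column; the index name is hypothetical.
CREATE INDEX mathouses_geog_idx
    ON mathouses USING GIST ((geom::geography));

-- Assumption: houses has a unique "id" column (not shown in the post).
-- At least one unique index is required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX mathouses_id_idx ON mathouses (id);

-- Rebuild the view contents without locking out concurrent SELECTs on it.
REFRESH MATERIALIZED VIEW CONCURRENTLY mathouses;
```

With a unique index in place, the hourly cron script could call the CONCURRENTLY form so that readers are not blocked while the view is refreshed.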

My final result:

 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
 select
  *,
  ST_DistanceSphere(geom, ST_MakePoint(52.5277411, 13.4)) as distance
  FROM mathouses
  WHERE ST_DWithin(geom, ST_MakePoint(52.5277411, 13.4)::geography,30000);

[Image: query plan visualisation]

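I am very happy with the solution. The query is now about 3 times faster, or more. To go further, the next logical step would be to look at the hardware or tune the PostgreSQL settings.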
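
As a starting point for the settings side, the current values can be inspected from psql before anything is changed. This is only a sketch; the value set below is a hypothetical example, not a recommendation for this server:

```sql
-- Inspect a few memory-related settings that commonly affect query plans.
SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;

-- Hypothetical value, applied to the current session only,
-- so its effect can be measured with EXPLAIN (ANALYZE, BUFFERS) first.
SET work_mem = '64MB';
```

Changing a setting for one session and re-running the EXPLAIN statement keeps to the measure, optimize, repeat approach described above.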