How to optimize an SQL query with subqueries, perhaps using a lateral join?

Asked: 2018-01-20 23:41:12

Tags: sql postgresql optimization postgis lateral-join

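I am trying to optimize a complex SQL query that is executed every time the map bounding box changes. I expected INNER JOIN LATERAL to be the fastest option, but it is not. Does anyone know how to speed this query up, and how to make better use of LATERAL JOIN?

The fastest query I have come up with so far: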

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
INNER JOIN (SELECT DISTINCT unnest(s0."rels") AS "rel" 
            FROM "hiking"."segments" AS s0 
            WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s2 ON TRUE 
WHERE (s2."rel" = h1."child");
  

Planning time: ~0.605 ms; execution time: ~37.232 ms

Practically the same as above, but with LATERAL JOIN. Is it really slower?

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
INNER JOIN LATERAL (SELECT DISTINCT unnest(s0."rels") AS "rel" 
                    FROM "hiking"."segments" AS s0 
                    WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s2 ON TRUE 
WHERE (s2."rel" = h1."child");
  

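Planning time: ~1.335 ms; execution time: ~38.518 ms

The slowest query, with a subquery nested inside a subquery (this was my first attempt, so I have improved on it somewhat since):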

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN (SELECT DISTINCT h0."parent" AS "parent" 
            FROM "hiking"."hierarchy" AS h0 
            INNER JOIN (SELECT DISTINCT unnest(s0."rels") AS "rel" 
                        FROM "hiking"."segments" AS s0 
                        WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s1 ON TRUE 
            WHERE (h0."child" = s1."rel")) AS s1 ON TRUE 
WHERE (r0."top" AND (r0."id" = s1."parent"));
  

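Planning time: ~1.017 ms; execution time: ~41.288 ms

1 answer:

Answer 0 (score: 3)

Without any knowledge of your database it is hard to reproduce your query logic, but I will try, so please bear with me: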

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
WHERE 
  EXISTS (
    SELECT 1
    FROM "hiking"."segments" AS s0 
    WHERE
      ST_Intersects(
        s0."geom",
        ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857))
      AND array[h1."child"] <@ s0."rels");
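On the LATERAL part of the question: LATERAL only changes anything when the subquery references columns from the preceding FROM items. The subquery in the question does not reference r0 or h1, so the two variants are probably planned the same way, which would fit the nearly identical timings. Below is a minimal sketch of a correlated version. It assumes "rels" is an array whose element type matches h1."child", and the LIMIT 1 is my own addition so that the join cannot multiply rows. It expresses the same filter as the EXISTS form, so I would not expect it to be faster; it only shows where LATERAL actually applies.

```sql
SELECT r0."id", r0."name"
FROM "hiking"."routes" AS r0
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent"
INNER JOIN LATERAL (
    -- Correlated subquery: h1."child" comes from the outer row,
    -- so this is conceptually evaluated once per outer row.
    SELECT 1
    FROM "hiking"."segments" AS s0
    WHERE ST_Intersects(
            s0."geom",
            ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857))
      AND array[h1."child"] <@ s0."rels"  -- assumes matching element types
    LIMIT 1  -- at most one row per outer row, so the join adds no duplicates
) AS s2 ON TRUE;
```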

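Two points about the EXISTS rewrite:

  1. Filtering with EXISTS / NOT EXISTS is sometimes faster than joining.
  2. Instead of unnesting the array field to compare its elements with some value, you can use the array comparison operators. With an appropriate GIN index this will be faster (see the PostgreSQL documentation on array operators and GIN indexes).

Here is a simple example of how to use an index on an array column and how much faster it makes things: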

    create table foo(bar int[]);
    insert into foo(bar) select array[1,2,3,x] from generate_series(1,1000000) as x;
    create index idx on foo using gin (bar); -- Note this
    select * from foo where 666 in (select unnest(bar)); -- 6936.345 ms on my HW
    select * from foo where array[666] <@ bar; -- 45.524 ms
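Applied to the tables in the question, the same idea might look like the sketch below. The index names are hypothetical, and it assumes that "rels" is an array column with the same element type as h1."child" and that "geom" is a PostGIS geometry column. A GiST index on "geom" may well exist already, in which case only the GIN index is new.

```sql
-- Hypothetical index names. IF NOT EXISTS only checks the name,
-- so first check whether an equivalent index already exists.
CREATE INDEX IF NOT EXISTS segments_rels_gin_idx
    ON "hiking"."segments" USING gin ("rels");   -- array containment: <@, @>, &&

CREATE INDEX IF NOT EXISTS segments_geom_gist_idx
    ON "hiking"."segments" USING gist ("geom");  -- bounding-box part of ST_Intersects

-- Refresh planner statistics so the new indexes are costed sensibly.
ANALYZE "hiking"."segments";

-- Show the actual plan and timings to see whether the indexes are used.
EXPLAIN (ANALYZE, BUFFERS)
SELECT r0."id", r0."name"
FROM "hiking"."routes" AS r0
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent"
WHERE EXISTS (
    SELECT 1
    FROM "hiking"."segments" AS s0
    WHERE ST_Intersects(
            s0."geom",
            ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857))
      AND array[h1."child"] <@ s0."rels");
```

Whether the planner prefers the GIN index, the GiST index, or a combination of both depends on how selective the bounding box is compared with the array filter, so it is worth comparing the plan for a small and a large map extent.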