PostgreSQL

Time: 2019-05-29 22:13:36

Tags: sql postgresql postgis

My query is very slow and I need to speed it up (database design below).

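I need to build a query that does the following:

  1. First, select all place_ids from places_table that lie within a radius
  2. Then, use those place_ids to find all item_ids in items_table
  3. Then, filter the selected item_ids by either of the following:
    • whether the item_id has a certain key in items_indexed_table (e.g. SELECT * FROM items_indexed_table WHERE key = 'foo_bar')
    • whether the item has a certain color in item_colors_table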

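I have written some queries, but they run slowly (> 1 minute). I believe all tables are indexed correctly and that the indexes are used, judging by EXPLAIN ANALYZE. Neither JOINs nor WITH ... AS () queries helped when I tried to apply all of the steps above.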

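However, when I run steps 1 and 2 in one query, it returns in under 500 ms, and when I run step 3 on its own for a given key and color, it also returns in under 500 ms. The problem appears when I try to combine them: the query takes a very long time to return. I ran EXPLAIN ANALYZE and found that steps 1 and 2 are executed last.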

Here are the two queries mentioned above:

with nearest_items AS (
  with nearest_places AS (
    SELECT place_id
    FROM places_table
    WHERE (ST_DWithin(places_table.geometry, ST_MakePoint(some_lon, some_lat)::geography, 1500))
  )
  SELECT items_table.item_id
  FROM items_table
  WHERE items_table.place_id IN (SELECT place_id FROM nearest_places)
) SELECT * FROM nearest_items;

Planning time: 0.457 ms

Execution time: 36.420 ms

But when I combine it with item_indexed_table, it becomes very slow.

with nearest_items AS (
  with nearest_places AS (
    SELECT place_id
    FROM places_table
    WHERE (ST_DWithin(places_table.geometry, ST_MakePoint(some_lon, some_lat)::geography, 1500))
  )
  SELECT items_table.item_id
  FROM items_table
  WHERE items_table.place_id IN (SELECT place_id FROM nearest_places)
) 
SELECT
       items_table.item_id
FROM
     items_table
WHERE
      items_table.item_id in (
         SELECT *
         FROM nearest_items
         )
  and items_table.item_id in (
         SELECT item_id
         FROM item_indexed_table
         WHERE key = 'medicine'
         )
;

Here is the EXPLAIN ANALYZE output:

Merge Join  (cost=2003517.08..3623886.10 rows=1111500 width=37) (actual time=1834.616..29030.328 rows=61 loops=1)
  Merge Cond: (nearest_items.item_id = items_table.item_id)
  CTE nearest_items
    ->  Nested Loop  (cost=15731.91..1414499.28 rows=17150069 width=37) (actual time=24.501..30.230 rows=5562 loops=1)
          CTE nearest_restaurants
            ->  Gather  (cost=1956.86..15549.89 rows=1389 width=37) (actual time=0.729..24.410 rows=19 loops=1)
                  Workers Planned: 2
                  Workers Launched: 2
                  ->  Parallel Bitmap Heap Scan on places_table  (cost=956.86..14410.99 rows=579 width=37) (actual time=0.154..0.197 rows=6 loops=3)
                        Recheck Cond: ((geometry)::geography && '0101000020E61000008AE42B8194975DC02D776682E1F04040'::geography)
                        Filter: (('0101000020E61000008AE42B8194975DC02D776682E1F04040'::geography && _st_expand((geometry)::geography, '1500'::double precision)) AND _st_dwithin((geometry)::geography, '0101000020E61000008AE42B8194975DC02D776682E1F04040'::geography, '1500'::double precision, true))
                        Rows Removed by Filter: 3
                        Heap Blocks: exact=29
                        ->  Bitmap Index Scan on places_geometry_to_geography_index  (cost=0.00..956.51 rows=20831 width=0) (actual time=0.249..0.249 rows=29 loops=1)
                              Index Cond: ((geometry)::geography && '0101000020E61000008AE42B8194975DC02D776682E1F04040'::geography)
          ->  HashAggregate  (cost=31.25..33.25 rows=200 width=32) (actual time=24.446..24.458 rows=19 loops=1)
                Group Key: nearest_places.place_id
                ->  CTE Scan on nearest_places  (cost=0.00..27.78 rows=1389 width=32) (actual time=0.731..24.428 rows=19 loops=1)
          ->  Bitmap Heap Scan on items_table items_table_1  (cost=150.76..6975.28 rows=1930 width=74) (actual time=0.035..0.165 rows=293 loops=19)
                Recheck Cond: (place_id = nearest_places.place_id)
                Heap Blocks: exact=470
                ->  Bitmap Index Scan on items_table_place_id_index  (cost=0.00..150.28 rows=1930 width=0) (actual time=0.030..0.030 rows=293 loops=19)
                      Index Cond: (place_id = nearest_places.place_id)
  ->  Sort  (cost=589017.24..591795.99 rows=1111500 width=69) (actual time=117.743..117.810 rows=61 loops=1)
        Sort Key: items_table_index.item_id
        Sort Method: quicksort  Memory: 33kB
        ->  Nested Loop  (cost=385877.25..386217.98 rows=1111500 width=69) (actual time=39.721..117.632 rows=61 loops=1)
              ->  HashAggregate  (cost=385876.55..385878.55 rows=200 width=32) (actual time=36.690..39.319 rows=5562 loops=1)
                    Group Key: nearest_items.item_id
                    ->  CTE Scan on nearest_items  (cost=0.00..343001.38 rows=17150069 width=32) (actual time=24.504..33.625 rows=5562 loops=1)
              ->  Index Only Scan using items_table_index_pkey on itema_table_index  (cost=0.70..32.82 rows=32 width=37) (actual time=0.013..0.013 rows=0 loops=5562)
                    Index Cond: ((item_id = nearest_items.item_id) AND (key = 'medicine'::text))
                    Heap Fetches: 34
  ->  Index Only Scan using items_table_pkey on items_table  (cost=0.56..1517926.72 rows=34308144 width=37) (actual time=0.020..14910.737 rows=33838956 loops=1)
        Heap Fetches: 2567
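
One reading of the plan above, offered as a sketch rather than a confirmed fix: the CTE branch yields 5,562 rows in about 30 ms, while the Index Only Scan on items_table at the bottom reads 33,838,956 rows and finishes only after roughly 15 seconds. The planner also estimates 17,150,069 rows for nearest_items. On PostgreSQL versions before 12, a WITH query is computed on its own and conditions from the outer query are not pushed into it, which may contribute to this plan shape. The version below joins the tables directly instead of going through CTEs. It assumes that item_id is the primary key of items_table (so the extra outer scan of items_table adds nothing) and that item_indexed_table has item_id and key columns, as the Index Cond in the plan suggests. some_lon and some_lat are placeholders, as in the question.

```sql
-- Sketch only: assumes items_table.item_id is unique and that
-- item_indexed_table has (item_id, key) columns.
-- some_lon / some_lat are placeholders for the search point.
SELECT i.item_id
FROM places_table AS p
JOIN items_table AS i
  ON i.place_id = p.place_id            -- step 2: items at the nearby places
WHERE ST_DWithin(                       -- step 1: places within 1500 m
        p.geometry,
        ST_MakePoint(some_lon, some_lat)::geography,
        1500)
  AND EXISTS (                          -- step 3: keep items that have the key
        SELECT 1
        FROM item_indexed_table AS k
        WHERE k.item_id = i.item_id
          AND k.key = 'medicine'
      );
```

Whether this is actually faster on the real data would need to be checked with EXPLAIN ANALYZE.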

Here is an image showing the database design:

https://ibb.co/GV8MpqV

Expected query

SELECT item_id
FROM item_tables
WHERE (ST_DWithin(places_table.geometry, ST_MakePoint(some_lon, some_lat)::geography, 1500))
  AND (
          (item_indexed_table.key = 'medicine')
       OR (item_indexed_table.key = 'potato' AND item_indexed_table.key = 'chips')
      )
  AND (items_color_table.color_id = 'f32nr-kfr32')
  • In plain English: select all item IDs within a radius of your location that contain the word "medicine", or the words "potato" and "chips", and whose color ID is "f32nr-kfr32"
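
The expected query above can be sketched in runnable form as follows. A single row of item_indexed_table cannot have key = 'potato' and key = 'chips' at the same time, so this sketch checks each key with its own EXISTS subquery. The table name item_colors_table and its item_id and color_id columns are assumptions taken from the names used elsewhere in the question. some_lon and some_lat remain placeholders.

```sql
-- Sketch only: column names on item_indexed_table and item_colors_table
-- are assumed, not confirmed by the schema image.
SELECT i.item_id
FROM places_table AS p
JOIN items_table AS i
  ON i.place_id = p.place_id
WHERE ST_DWithin(
        p.geometry,
        ST_MakePoint(some_lon, some_lat)::geography,
        1500)
  AND (
        -- the item has the key 'medicine' ...
        EXISTS (SELECT 1 FROM item_indexed_table AS k
                WHERE k.item_id = i.item_id AND k.key = 'medicine')
        -- ... or it has both 'potato' and 'chips'
     OR (
        EXISTS (SELECT 1 FROM item_indexed_table AS k
                WHERE k.item_id = i.item_id AND k.key = 'potato')
        AND
        EXISTS (SELECT 1 FROM item_indexed_table AS k
                WHERE k.item_id = i.item_id AND k.key = 'chips')
        )
      )
  -- the item has the requested color
  AND EXISTS (SELECT 1 FROM item_colors_table AS c
              WHERE c.item_id = i.item_id
                AND c.color_id = 'f32nr-kfr32');
```

Using EXISTS rather than joining item_indexed_table directly keeps each item_id from appearing once per matching key row.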

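Expected result

  • A list of item_ids (they should be within the radius, and it should be possible to filter them by color and by keys in the other tables)
  • My tables are large, so I want the execution time to be as short as possible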

Here are the table sizes:

places_table: about 100,000 rows

item_table: ~4,000,000 rows

item_index_table: > 100,000,000 rows

item_colors_table: > 1,000,000 rows

Any help or tips would be appreciated, thank you!

0 answers:

No answers