Optimizing a slow SQL query that is fast when run in two separate steps

Asked: 2012-12-14 10:34:20

Tags: sql postgresql

I am having trouble optimizing the following SQL query (using PostgreSQL 9.1):

WITH regions AS (
    SELECT r1.region_id
      FROM region r1, 
           (SELECT * 
              FROM region 
             WHERE region_id = 1) r2
     WHERE (r1.region_country = r2.region_country
             OR r2.region_country = 0) 
       AND (r1.region_province = r2.region_province 
             OR r2.region_province = 0) 
       AND (r1.region_area = r2.region_area 
             OR r2.region_area = 0))

SELECT id 
  FROM users 
 WHERE user_region in (SELECT region_id 
                         FROM regions);

EXPLAIN (ANALYZE, BUFFERS) produces the following output:

Nested Loop  (cost=85.02..42405.93 rows=13217 width=4) (actual time=0.447..970.132 rows=527444 loops=1)                                                                                                                          
  Buffers: shared hit=464136                                                                                                                                                                                                     
  CTE regions                                                                                                                                                                                                                    
    ->  Nested Loop  (cost=0.00..32.11 rows=5 width=4) (actual time=0.029..0.237 rows=135 loops=1)                                                                                                                               
          Join Filter: (((r1.region_country = region.region_country) OR (region.region_country = 0)) AND ((r1.region_province = region.region_province) OR (region.region_province = 0)) AND ((r1.region_area = region.region_area) OR (region.region_area = 0))) 
          Buffers: shared hit=7                                                                                                                                                                                                  
          ->  Index Scan using region_pkey on region  (cost=0.00..8.27 rows=1 width=6) (actual time=0.015..0.016 rows=1 loops=1)                                                                                                 
                Index Cond: (re_nr = 1)                                                                                                                                                                                          
                Buffers: shared hit=3                                                                                                                                                                                            
          ->  Seq Scan on region r1  (cost=0.00..9.67 rows=567 width=10) (actual time=0.007..0.072 rows=567 loops=1)                                                                                                             
                Buffers: shared hit=4                                                                                                                                                                                            
  ->  HashAggregate  (cost=0.11..0.16 rows=5 width=4) (actual time=0.326..0.449 rows=135 loops=1)                                                                                                                                
        Buffers: shared hit=7                                                                                                                                                                                                    
        ->  CTE Scan on regions  (cost=0.00..0.10 rows=5 width=4) (actual time=0.032..0.278 rows=135 loops=1)                                                                                                                    
              Buffers: shared hit=7                                                                                                                                                                                              
  ->  Bitmap Heap Scan on users  (cost=52.79..8441.69 rows=2643 width=8) (actual time=1.442..6.459 rows=3907 loops=135)                                                                                                   
        Recheck Cond: (user_region = regions.region_id)                                                                                                                                                                            
        Buffers: shared hit=464129                                                                                                                                                                                               
        ->  Bitmap Index Scan on user_region  (cost=0.00..52.13 rows=2643 width=0) (actual time=0.675..0.675 rows=3909 loops=135)                                                                                              
              Index Cond: (user_region = regions.region_id)                                                                                                                                                                        
              Buffers: shared hit=1847                                                                                                                                                                                           
Total runtime: 1003.867 ms                                                                                                                                                                                                       

If I simply paste the output of the regions query into the IN list, everything is as fast as expected:

SELECT id
FROM users
WHERE user_region in (1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110)

EXPLAIN produces the following output for this version:

Bitmap Heap Scan on users  (cost=5643.57..135774.21 rows=322812 width=4) (actual time=138.339..365.676 rows=527444 loops=1)                                                                                                                                                                                                                                                                                                                                                                         
  Recheck Cond: (user_region = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110}'::integer[]))     
  Buffers: shared hit=72973 read=1302                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
  ->  Bitmap Index Scan on user_region  (cost=0.00..5562.86 rows=322812 width=0) (actual time=114.446..114.446 rows=527752 loops=1)                                                                                                                                                                                                                                                                                                                                                                      
        Index Cond: (user_region = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110}'::integer[])) 
        Buffers: shared hit=546 read=1301                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
Total runtime: 397.975 ms                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  

Running the regions query on its own is also very fast:

Nested Loop  (cost=0.00..32.11 rows=5 width=4) (actual time=0.059..12.323 rows=135 loops=1)                                                                                                                              
  Join Filter: (((r1.region_country = region.region_country) OR (region.region_country = 0)) AND ((r1.region_province = region.region_province) OR (region.region_province = 0)) AND ((r1.region_area = region.region_area) OR (region.region_area = 0))) 
  Buffers: shared hit=1 read=6                                                                                                                                                                                           
  ->  Index Scan using region_pkey on region  (cost=0.00..8.27 rows=1 width=6) (actual time=0.044..0.046 rows=1 loops=1)                                                                                                 
        Index Cond: (re_nr = 1)                                                                                                                                                                                          
        Buffers: shared read=3                                                                                                                                                                                           
  ->  Seq Scan on region r1  (cost=0.00..9.67 rows=567 width=10) (actual time=0.005..12.122 rows=567 loops=1)                                                                                                            
        Buffers: shared hit=1 read=3                                                                                                                                                                                     
Total runtime: 12.379 ms                                                                                                                                                                                                

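If I add more columns to the SELECT from users, the time difference between the two approaches grows even larger.

Is there a way to compute everything in one fast query?

Any help or pointers toward a solution would be greatly appreciated.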

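[edit] Adding a sample of the region table, as requested in the comments.

A user can choose a region (user_region), which can be a country, a province, or a city / part of a city. The regions query tries to find all region_ids within that country, province, or city. If the user selects Austria (region_id = 1), all other region_ids in Austria should be returned. If the user selects "Lower Austria" (region_id = 26), all regions in the province of Lower Austria should be returned (27, 28, 29, 30 in the sample data).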

select * from region limit 30;
 region_country | region_province | region_area |     region_name     | region_id 
----------------+-----------------+-------------+---------------------+-----------
              1 |               0 |           0 | Austria             |         1
              1 |               1 |           0 | Vienna              |         2
              1 |               1 |           1 | Vienna 1            |         3
              1 |               1 |           2 | Vienna 2            |         4
              1 |               1 |           3 | Vienna 3            |         5
              1 |               1 |           4 | Vienna 4            |         6
              1 |               1 |           5 | Vienna 5            |         7
              1 |               1 |           6 | Vienna 6            |         8
              1 |               1 |           7 | Vienna 7            |         9
              1 |               1 |           8 | Vienna 8            |        10
              1 |               1 |           9 | Vienna 9            |        11
              1 |               1 |          10 | Vienna 10           |        12
              1 |               1 |          11 | Vienna 11           |        13
              1 |               1 |          12 | Vienna 12           |        14
              1 |               1 |          13 | Vienna 13           |        15
              1 |               1 |          14 | Vienna 14           |        16
              1 |               1 |          15 | Vienna 15           |        17
              1 |               1 |          16 | Vienna 16           |        18
              1 |               1 |          17 | Vienna 17           |        19
              1 |               1 |          18 | Vienna 18           |        20
              1 |               1 |          19 | Vienna 19           |        21
              1 |               1 |          20 | Vienna 20           |        22
              1 |               1 |          21 | Vienna 21           |        23
              1 |               1 |          22 | Vienna 22           |        24
              1 |               1 |          23 | Vienna 23           |        25
              1 |               2 |           0 | Lower Austria       |        26
              1 |               2 |           1 | St.Pölten           |        27
              1 |               2 |           2 | Amstetten           |        28
              1 |               2 |           3 | Baden               |        29
              1 |               2 |           4 | Bruck an der Leitha |        30
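
To make the matching rule described above concrete, here is a minimal sketch of the region lookup for Lower Austria, written as an explicit self-join. It assumes the region table has exactly the columns shown in the sample, and that a 0 in the selected row acts as a wildcard for that level.

```sql
-- Sketch: find every region that falls under the selected region (here region_id = 26).
-- A 0 in the selected row's country/province/area means "match anything at this level".
SELECT r1.region_id, r1.region_name
  FROM region r1
  JOIN region r2
    ON (r1.region_country  = r2.region_country  OR r2.region_country  = 0)
   AND (r1.region_province = r2.region_province OR r2.region_province = 0)
   AND (r1.region_area     = r2.region_area     OR r2.region_area     = 0)
 WHERE r2.region_id = 26
 ORDER BY r1.region_id;
```

Against the 30 sample rows this matches regions 26 through 30. The selected row itself satisfies all three conditions, so it is returned along with its areas.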

2 Answers:

Answer 0: (score: 1)

A join is usually more efficient than an IN clause:

.
.
.
SELECT id FROM users 
  INNER JOIN regions ON user_region = region_id;

Assuming each user matches only one region (which appears to be the case from your query), this will give you the same results.
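
Spelled out in full, with the CTE copied from the question in place of the elided part, the suggestion would look roughly like the sketch below. The aliases u and r are added here for readability and are not part of the original answer. Whether the planner actually picks a better plan is something to check with EXPLAIN ANALYZE rather than assume.

```sql
-- Sketch of the join variant: the CTE is copied from the question,
-- and the IN (...) subquery is replaced by an inner join.
WITH regions AS (
    SELECT r1.region_id
      FROM region r1,
           (SELECT *
              FROM region
             WHERE region_id = 1) r2
     WHERE (r1.region_country = r2.region_country
             OR r2.region_country = 0)
       AND (r1.region_province = r2.region_province
             OR r2.region_province = 0)
       AND (r1.region_area = r2.region_area
             OR r2.region_area = 0))
SELECT u.id
  FROM users u
 INNER JOIN regions r ON u.user_region = r.region_id;
```

The join returns the same rows as the IN version only as long as region_id is unique in the output of regions; a duplicate there would duplicate the matching users.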

Answer 1: (score: 0)

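Have you tried running ANALYZE on your tables?

From the EXPLAIN output you posted, I can see that Postgres expects about 39 times fewer rows than are actually returned (rows=13217 estimated versus rows=527444 actual).

When Postgres's estimates differ this much from the actual result set, it can choose a suboptimal plan, and the query takes longer to complete.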
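
A minimal sketch of that check, assuming the table names region and users from the question: refresh the planner statistics, then re-run the original query under EXPLAIN and compare the estimated row counts with the actual ones. This only shows how to test the suggestion; it is not a guarantee that the estimates will improve.

```sql
-- Refresh planner statistics for both tables involved in the query.
ANALYZE region;
ANALYZE users;

-- Re-run the query from the question with timing and buffer statistics,
-- then compare each node's estimated "rows=" with its actual "rows=".
EXPLAIN (ANALYZE, BUFFERS)
WITH regions AS (
    SELECT r1.region_id
      FROM region r1,
           (SELECT *
              FROM region
             WHERE region_id = 1) r2
     WHERE (r1.region_country = r2.region_country
             OR r2.region_country = 0)
       AND (r1.region_province = r2.region_province
             OR r2.region_province = 0)
       AND (r1.region_area = r2.region_area
             OR r2.region_area = 0))
SELECT id
  FROM users
 WHERE user_region IN (SELECT region_id
                         FROM regions);
```

If the top-level estimate is still far from the actual row count after ANALYZE, stale statistics are probably not the only cause.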