Slow search for the minimum of a list of maxima using a SQL WITH clause

Asked: 2010-09-30 16:44:42

Tags: sql postgresql performance aggregate-functions

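I have a problem with this query: it is a bit too slow for me :(

My problem: I want the smallest tick in a list that holds the maximum tick of each seed.

=> equivalent to min(max(seed1), max(seed2), max(seed3), etc.)

PS: each seed contains ~10000 rows (1 row = 1 tick).

PS2: all the data lives in a single table (not a good solution, I know, but I have no choice at this point in the project). The simulations table contains millions of rows, hmm... around 40 M :/

PS3: I have indexes on v_seed and v_exp.

This query takes ~135 seconds to run, which is very slow :(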

 WITH summary AS (
     SELECT v_ticks, v_seed, ROW_NUMBER() OVER (PARTITION BY v_seed ORDER BY v_ticks DESC) AS rk
     FROM simulations
     WHERE v_exp = 23
       AND v_seed IN (2133836778, -2061794068, 1260042744, -1324330098, -423279216, -685846464, 142959438, -1154715079, 1062336798, -624140595, -922352011, -613647601, -330177159, 1945002173, 131053356, -216538235, -636982783, 979930868, 321237028, -1103129161, 476235295, -1916604834, -54027108, 17850135, -60658084)
 )
 SELECT min(s.v_ticks)
 FROM summary s
 WHERE s.rk = 1

Update 1: EXPLAIN output

Aggregate  (cost=6327697.46..6327697.47 rows=1 width=4)
  CTE summary
    ->  WindowAgg  (cost=5302458.61..5784782.09 rows=24116174 width=12)
         ->  Sort  (cost=5302458.61..5362749.05 rows=24116174 width=12)
                Sort Key: simulations.v_seed, simulations.v_ticks
                ->  Bitmap Heap Scan on simulations  (cost=415238.16..1933251.42 rows=24116174 width=12)
                      Recheck Cond: (v_seed = ANY ('{2133836778,-2061794068,1260042744,-1324330098,-423279216,-685846464,142959438,-1154715079,1062336798,-624140595,-922352011,-613647601,-330177159,1945002173,131053356,-216538235,-636982783,979930868,321237028,-1103129161,476235295,-1916604834,-54027108,17850135,-60658084}'::bigint[]))
                      Filter: (v_exp = 23)
                      ->  Bitmap Index Scan on index_seed  (cost=0.00..409209.12 rows=25752303 width=0)
                            Index Cond: (v_seed = ANY ('{2133836778,-2061794068,1260042744,-1324330098,-423279216,-685846464,142959438,-1154715079,1062336798,-624140595,-922352011,-613647601,-330177159,1945002173,131053356,-216538235,-636982783,979930868,321237028,-1103129161,476235295,-1916604834,-54027108,17850135,-60658084}'::bigint[]))
  ->  CTE Scan on summary s  (cost=0.00..542613.92 rows=120581 width=4)
        Filter: (rk = 1)

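If you have any ideas for optimizing this query, that would be great :) Thanks!

3 answers:

Answer 0: (score: 1)

Make sure the Simulations table has a clustered index on v_seed, and try: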

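-- Temp table holding the seeds of interest (#Seeds is SQL Server temp-table syntax)
CREATE TABLE #Seeds(seed int);

INSERT #Seeds
SELECT  2133836778 UNION
SELECT  -2061794068 UNION
...
SELECT  -60658084;

-- Rank the ticks of each seed, highest first, then take the smallest per-seed maximum
WITH summary AS
(
    SELECT  sim.v_ticks, sim.v_seed,
            ROW_NUMBER() OVER(PARTITION BY sim.v_seed ORDER BY sim.v_ticks DESC) AS rk
    FROM    simulations sim
        INNER JOIN
            #Seeds seeds
        ON sim.v_seed = seeds.seed
    WHERE   sim.v_exp = 23
)
SELECT  min(s.v_ticks)
FROM    summary s
WHERE   s.rk = 1;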
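
The question is tagged postgresql, while the snippet above uses SQL Server's #Seeds temp-table syntax, so here is a hedged sketch of how the same idea might look in PostgreSQL. The table name seeds, the bigint column type (guessed from the ::bigint[] cast in the EXPLAIN output), and the shortened three-value seed list are assumptions for illustration only; the full list from the question would go in the VALUES clause.

```sql
-- Assumption: a session-local temp table named "seeds"; bigint is a guess at v_seed's type.
CREATE TEMP TABLE seeds (seed bigint PRIMARY KEY);

-- Only three of the question's seeds are shown here to keep the sketch short.
INSERT INTO seeds (seed) VALUES
    (2133836778),
    (-2061794068),
    (-60658084);

-- Give the planner row counts for the new table before joining against it.
ANALYZE seeds;

-- Same ranking query as above, with the IN list replaced by a join.
WITH summary AS (
    SELECT sim.v_ticks,
           sim.v_seed,
           ROW_NUMBER() OVER (PARTITION BY sim.v_seed
                              ORDER BY sim.v_ticks DESC) AS rk
    FROM simulations sim
    INNER JOIN seeds ON sim.v_seed = seeds.seed
    WHERE sim.v_exp = 23
)
SELECT min(s.v_ticks)
FROM summary s
WHERE s.rk = 1;
```

PostgreSQL has no continuously maintained clustered index. The closest equivalent is the one-off command CLUSTER simulations USING index_seed (index_seed being the index name shown in the EXPLAIN output), which rewrites the table under an exclusive lock, so whether it is practical on ~40 M rows would need testing.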

Answer 1: (score: 0)

Isn't this the same as:
WITH summary AS (
    -- One row per seed, carrying that seed's highest tick
    SELECT v_seed, MAX(v_ticks) AS v_ticks
    FROM simulations
    WHERE v_exp = 23
      AND v_seed IN (2133836778, -2061794068, 1260042744, -1324330098, -423279216, -685846464, 142959438, -1154715079, 1062336798, -624140595, -922352011, -613647601, -330177159, 1945002173, 131053356, -216538235, -636982783, 979930868, 321237028, -1103129161, 476235295, -1916604834, -54027108, 17850135, -60658084)
    GROUP BY v_seed
)
SELECT MIN(v_ticks)
FROM summary
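
To check that the GROUP BY form and the ROW_NUMBER() form from the question agree, a small self-contained sketch on toy data may help. The table simulations_demo and its six rows are made up for illustration; they are not the asker's data.

```sql
-- Hypothetical toy table mirroring the three columns used in the question.
CREATE TEMP TABLE simulations_demo (v_exp int, v_seed bigint, v_ticks int);

INSERT INTO simulations_demo (v_exp, v_seed, v_ticks) VALUES
    (23, 1, 10), (23, 1, 40),   -- seed 1: max tick 40
    (23, 2, 25), (23, 2, 30),   -- seed 2: max tick 30
    (23, 3, 50),                -- seed 3: max tick 50
    (24, 1, 5);                 -- other experiment, filtered out

-- Window-function form, as in the question: returns 30.
WITH summary AS (
    SELECT v_ticks, v_seed,
           ROW_NUMBER() OVER (PARTITION BY v_seed ORDER BY v_ticks DESC) AS rk
    FROM simulations_demo
    WHERE v_exp = 23 AND v_seed IN (1, 2, 3)
)
SELECT min(v_ticks) FROM summary WHERE rk = 1;

-- GROUP BY form: also returns 30, without sorting every row per seed.
WITH summary AS (
    SELECT v_seed, max(v_ticks) AS v_ticks
    FROM simulations_demo
    WHERE v_exp = 23 AND v_seed IN (1, 2, 3)
    GROUP BY v_seed
)
SELECT min(v_ticks) FROM summary;
```

One caveat: the two forms match only if v_ticks is never NULL. In PostgreSQL, ORDER BY ... DESC places NULLs first by default, so a NULL tick would take rk = 1 in the window form, whereas max() ignores NULLs.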

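Answer 2: (score: 0)

This should do the trick (sorry, I have no Postgres database to test on, but it works on MySQL). Make sure you have an index on v_seed and v_exp, since both columns are used to filter the data.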

select min(s.max_tick) as smallest_max
from (select max(v_ticks) as max_tick
      from wts_test
      where v_exp=23 AND
            v_seed IN (2133836778, -2061794068, 1260042744, -1324330098, -423279216, -685846464, 142959438, -1154715079, 1062336798,-624140595, -922352011, -613647601, -330177159, 1945002173, 131053356, -216538235, -636982783, 979930868, 321237028, -1103129161, 476235295, -1916604834, -54027108, 17850135, -60658084)
  group by v_seed) s
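
As a possible follow-up to the indexing advice, here is a hedged sketch of a composite index together with a query shaped to use it, written against the question's simulations table rather than wts_test (which appears to be the answerer's own test table). The index name simulations_exp_seed_ticks_idx and the shortened three-value seed list are assumptions for illustration; none of this has been run against the asker's data.

```sql
-- Hypothetical composite index: the two equality-filtered columns first,
-- then the column whose maximum is wanted.
CREATE INDEX simulations_exp_seed_ticks_idx
    ON simulations (v_exp, v_seed, v_ticks);

-- One max() per seed, written as a correlated subquery over an inline seed list.
-- Only three of the question's seeds are shown here.
SELECT min(per_seed.max_tick) AS smallest_max
FROM (
    SELECT (SELECT max(sim.v_ticks)
            FROM simulations sim
            WHERE sim.v_exp = 23
              AND sim.v_seed = seeds.seed) AS max_tick
    FROM (VALUES (2133836778), (-2061794068), (-60658084)) AS seeds(seed)
) AS per_seed;
```

The idea is that, with such an index, the planner may be able to answer each max() with a single index probe per seed instead of sorting millions of rows. Whether it actually chooses that plan is worth confirming with EXPLAIN ANALYZE. A seed with no matching rows yields NULL, which the outer min() ignores.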