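Postgres is choosing suboptimal joins for several queries that use simple joins. It appears to perform a Cartesian join and then discard rows with a join filter, as the explain plan below shows: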
Insert on sierra (cost=7.350..26.120 rows=1 width=88) (actual rows=0 loops=1)
-> Nested Loop (cost=7.350..26.080 rows=1 width=426) (actual rows=14356 loops=1)
-> Nested Loop (cost=6.240..19.470 rows=1 width=304) (actual rows=14356 loops=1)
Join Filter: (xray_two.echo_romeo = november_three.echo_romeo)
-> Nested Loop Left Join (cost=6.240..18.380 rows=1 width=184) (actual rows=14356 loops=1)
-> Nested Loop (cost=3.970..11.980 rows=1 width=187) (actual rows=14356 loops=1)
Join Filter: ((lima.victor = xray_two.victor) AND (lima.three = xray_two.three))
Rows Removed by Join Filter: 3813025380
-> Index Scan using foxtrot on echo_two lima (cost=1.720..5.590 rows=1 width=185) (actual rows=14356 loops=1)
Index Cond: ((delta = 'uniform'::timestamp without time zone) AND (zulu_tango = 1))
-> Index Scan using november_hotel on mike xray_two (cost=2.250..6.330 rows=1 width=18) (actual rows=265606 loops=14356)
Index Cond: ((delta = 'uniform'::timestamp without time zone) AND (zulu_tango = 1))
-> Index Scan using zulu_six on india whiskey (cost=2.270..6.360 rows=1 width=17) (actual rows=1 loops=14356)
Index Cond: ((lima.delta = delta) AND (delta = 'uniform'::timestamp without time zone) AND (lima.victor = victor))
-> Seq Scan on kilo november_three (cost=0.000..1.040 rows=1 width=124) (actual rows=1 loops=14356)
-> Limit (cost=1.110..1.130 rows=1 width=130) (actual rows=1 loops=14356)
-> Sort (cost=1.110..1.130 rows=1 width=130) (actual rows=1 loops=14356)
Sort Key: xray_delta.six DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on xray_delta (cost=0.000..1.070 rows=1 width=130) (actual rows=1 loops=1)
Filter: (six < ('juliet'::cstring)::timestamp without time zone)
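You can see that the join filter removes ~4 billion rows. Postgres expects only 1 row, but the index scan on november_hotel actually returns 265k rows. It then loops over those 265k rows 14k times. What causes the planner to choose such an inefficient join/filter plan? Some points to note:
What is causing Postgres to get this wrong? I can work around the problem by running a manual ANALYZE before executing the problematic query, but I don't think that addresses the underlying issue, which could surface anywhere in my code.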
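Edit
As requested, I have reduced the query to its most basic form: select * from two tables, joined on all of the indexed fields.
Method: I inserted a new hour of data and immediately ran explain analyze. When that finished after roughly 4000 s, I ran the same explain analyze again (by which point Postgres had autovacuumed the tables), and it returned in about half a second. The only difference is the actual number of rows returned for table_b inside the nested loop.
Query: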
explain analyze
select col.*, strat.*
FROM table_a col
JOIN table_b strat
ON (strat.cellkey = col.cellkey
AND strat.offerkey = col.offerkey
AND strat.strategykey = col.strategykey
AND strat.startdate = col.startdate)
where col.startdate = '2017-05-17 1700'
AND col.strategykey = 1;
First explain:
Nested Loop (cost=4.51..13.48 rows=1 width=544) (actual time=6.210..4264064.949 rows=31169 loops=1)
Join Filter: ((col.cellkey = strat.cellkey) AND (col.offerkey = strat.offerkey))
Rows Removed by Join Filter: 8278642245
-> Index Scan using table_a_1 on table_a col (cost=2.24..6.76 rows=1 width=494) (actual time=0.034..177.203 rows=31169 loops=1)
Index Cond: ((startdate = '2017-05-17 17:00:00'::timestamp without time zone) AND (strategykey = 1))
-> Index Scan using table_b_1 on table_b strat (cost=2.27..6.66 rows=1 width=50) (actual time=0.020..94.664 rows=265606 loops=31169)
Index Cond: ((startdate = '2017-05-17 17:00:00'::timestamp without time zone) AND (strategykey = 1))
Planning time: 4.689 ms
Execution time: 4264074.251 ms
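As a sanity check on the Cartesian-join reading, the "Rows Removed by Join Filter" figure can be reproduced from the row counts in the plan above, assuming each of the 31169 outer rows is compared against all 265606 inner rows and exactly one inner row survives per outer row (31169 rows are returned in total). The column alias is only illustrative:

```sql
-- Outer rows * inner rows per loop, minus the rows that passed the filter.
-- The bigint cast avoids an integer overflow on the product.
SELECT 31169::bigint * 265606 - 31169 AS rows_removed_by_join_filter;
-- 8278642245, matching the plan output
```

The same arithmetic with 14356 outer rows gives 3813025380, the figure in the original plan.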
Second explain:
Nested Loop (cost=4.51..341069.90 rows=36588 width=545) (actual time=0.290..538.989 rows=31169 loops=1)
-> Index Scan using table_a_1 on table_a col (cost=2.24..73371.98 rows=36662 width=495) (actual time=0.168..81.488 rows=31169 loops=1)
Index Cond: ((startdate = '2017-05-17 17:00:00'::timestamp without time zone) AND (strategykey = 1))
-> Index Scan using table_b_1 on table_b strat (cost=2.27..7.26 rows=1 width=50) (actual time=0.012..0.013 rows=1 loops=31169)
Index Cond: ((startdate = '2017-05-17 17:00:00'::timestamp without time zone) AND (cellkey = col.cellkey) AND (strategykey = 1) AND (offerkey = col.offerkey))
Planning time: 10.053 ms
Execution time: 543.467 ms
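How can we account for this? The tables receive frequent inserts, and even with very aggressive vacuuming we cannot guarantee a fresh analyze before every query we run.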
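For reference, the manual workaround and the kind of per-table autovacuum tuning mentioned above look roughly like the sketch below. It uses the table names from the simplified query. The threshold and scale factor are illustrative assumptions, not the values actually in use:

```sql
-- Workaround: refresh planner statistics right after the hourly load,
-- before the join is executed.
ANALYZE table_a;
ANALYZE table_b;

-- Assumed settings: ask autovacuum to re-analyze after a fixed number of
-- changed rows rather than a fraction of the table. Values are illustrative.
ALTER TABLE table_b SET (
    autovacuum_analyze_scale_factor = 0.0,
    autovacuum_analyze_threshold = 10000
);
```

Even with settings like these, autovacuum runs asynchronously, so a query issued immediately after the insert can still be planned with stale statistics.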