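I have a relatively modest query in Postgres that joins two tables, but its performance differs dramatically between my development environment and production.
Here is the query: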
select count(seat_id) as avail, ev.event_name, price_code,
(case when substring(section_name, 4, 1) = 'A' then substring(section_name, 1, 3)
when row_name < '9999' then section_name
else section_name || 'C'
end) as section_name_full, class_name
from tm_availseats3_exp seats join tm_event_map ev on ev.event_name = seats.event_name
where event_sub_type = 'General'
group by ev.event_name, price_code, section_name_full, class_name, row_name
The data and the indexes are identical in both environments. I ran the query with EXPLAIN ANALYZE in both environments and got the following results.
This one is fast:
HashAggregate (cost=29061.69..29229.88 rows=7475 width=41) (actual time=662.006..682.448 rows=17444 loops=1)
Group Key: ev.event_name, seats.price_code, CASE WHEN ("substring"((seats.section_name)::text, 4, 1) = 'A'::text) THEN "substring"((seats.section_name)::text, 1, 3) WHEN ((seats.row_name)::text < '9999'::text) THEN (seats.section_name)::text ELSE ((seats.section_name)::text || 'C'::text) END, seats.class_name, seats.row_name
-> Nested Loop (cost=1090.79..28949.57 rows=7475 width=41) (actual time=2.267..488.597 rows=110977 loops=1)
-> HashAggregate (cost=784.42..784.44 rows=1 width=51) (actual time=2.076..2.163 rows=61 loops=1)
Group Key: ev_1.event_name, ev.event_name, ev_1.event_date, ev.event_name_long, ev.event_time, ev.event_day, CASE WHEN ("substring"((ev.event_name)::text, 1, 4) = 'EUCB'::text) THEN 'General'::text ELSE 'Premium'::text END
-> Nested Loop (cost=558.96..784.41 rows=1 width=51) (actual time=0.997..1.967 rows=61 loops=1)
-> HashAggregate (cost=558.68..558.78 rows=10 width=12) (actual time=0.953..1.021 rows=61 loops=1)
Group Key: ev_1.event_name, ev_1.event_date
-> Seq Scan on tm_evnt3 ev_1 (cost=0.00..558.63 rows=10 width=12) (actual time=0.035..0.876 rows=61 loops=1)
Filter: ("substring"((event_name)::text, 1, 4) = 'EUCB'::text)
Rows Removed by Filter: 1981
-> Index Scan using idx_tm_evnt3__event_date on tm_evnt3 ev (cost=0.28..22.54 rows=1 width=43) (actual time=0.006..0.011 rows=1 loops=61)
Index Cond: (event_date = ev_1.event_date)
Filter: (("substring"((event_name)::text, 1, 4) <> 'PARK'::text) AND ("substring"((event_name)::text, 1, 5) <> 'PROMO'::text) AND ("substring"((event_name)::text, length((event_name)::text), 1) <> 'P'::text) AND (CASE WHEN ("substring"((event_name)::text, 1, 4) = 'EUCB'::text) THEN 'General'::text ELSE 'Premium'::text END = 'General'::text))
Rows Removed by Filter: 5
-> Bitmap Heap Scan on tm_availseats3_exp seats (cost=306.36..27996.93 rows=7475 width=41) (actual time=0.194..2.352 rows=1819 loops=61)
Recheck Cond: ((event_name)::text = (ev.event_name)::text)
Heap Blocks: exact=12875
-> Bitmap Index Scan on tm_availseats3_exp_on_event (cost=0.00..304.50 rows=7475 width=0) (actual time=0.168..0.168 rows=1819 loops=61)
Index Cond: ((event_name)::text = (ev.event_name)::text)
Planning time: 0.498 ms
Execution time: 700.538 ms
This one is really slow:
HashAggregate (cost=1083030.39..1083267.27 rows=10528 width=41) (actual time=107897.847..107918.705 rows=17444 loops=1)
Group Key: ev.event_name, seats.price_code, CASE WHEN ("substring"((seats.section_name)::text, 4, 1) = 'A'::text) THEN "substring"((seats.section_name)::text, 1, 3) WHEN ((seats.row_name)::text < '9999'::text) THEN (seats.section_name)::text ELSE ((seats.section_name)::text || 'C'::text) END, seats.class_name, seats.row_name
-> Hash Join (cost=795.21..1082872.47 rows=10528 width=41) (actual time=47773.210..107704.968 rows=110977 loops=1)
Hash Cond: ((seats.event_name)::text = (ev.event_name)::text)
-> Seq Scan on tm_availseats3_exp seats (cost=0.00..1052862.73 rows=7727373 width=41) (actual time=3352.769..103536.131 rows=3609106 loops=1)
-> Hash (cost=795.20..795.20 rows=1 width=8) (actual time=2.364..2.364 rows=61 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 3kB
-> Subquery Scan on ev (cost=795.18..795.20 rows=1 width=8) (actual time=2.107..2.292 rows=61 loops=1)
-> HashAggregate (cost=795.18..795.19 rows=1 width=51) (actual time=2.104..2.169 rows=61 loops=1)
Group Key: ev_2.event_name, ev_1.event_name, ev_2.event_date, ev_1.event_name_long, ev_1.event_time, ev_1.event_day, CASE WHEN ("substring"((ev_1.event_name)::text, 1, 4) = 'EUCB'::text) THEN 'General'::text ELSE 'Premium'::text END
-> Nested Loop (cost=568.96..795.16 rows=1 width=51) (actual time=0.998..1.987 rows=61 loops=1)
-> HashAggregate (cost=568.68..568.78 rows=10 width=12) (actual time=0.942..1.018 rows=61 loops=1)
Group Key: ev_2.event_name, ev_2.event_date
-> Seq Scan on tm_evnt3 ev_2 (cost=0.00..568.63 rows=10 width=12) (actual time=0.039..0.864 rows=61 loops=1)
Filter: ("substring"((event_name)::text, 1, 4) = 'EUCB'::text)
Rows Removed by Filter: 1981
-> Index Scan using idx_tm_evnt3__event_date on tm_evnt3 ev_1 (cost=0.28..22.62 rows=1 width=43) (actual time=0.006..0.011 rows=1 loops=61)
Index Cond: (event_date = ev_2.event_date)
Filter: (("substring"((event_name)::text, 1, 4) <> 'PARK'::text) AND ("substring"((event_name)::text, 1, 5) <> 'PROMO'::text) AND ("substring"((event_name)::text, length((event_name)::text), 1) <> 'P'::text) AND (CASE WHEN ("substring"((event_name)::text, 1, 4) = 'EUCB'::text) THEN 'General'::text ELSE 'Premium'::text END = 'General'::text))
Rows Removed by Filter: 5
Planning time: 0.482 ms
Execution time: 107936.927 ms
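It is clear to me that the problem is that the second execution plan starts with a Seq Scan on the larger of the two tables involved, but I don't understand why the planner doesn't produce the same plan in both environments.
Is the Postgres query planner deterministic? Is there a way to give it hints about which query plan it should use?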
Answer 0 (score: 1)
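As Ildar Musin commented, the right approach is to make sure the statistics are up to date in every database. My understanding was that this happens automatically, but in this case it had not.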
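One way to check whether statistics are stale is to look at when each table was last analyzed and how many rows the planner believes it holds. The sketch below assumes PostgreSQL 9.4 or later (for the `n_mod_since_analyze` column) and uses the table names that appear in the plans above. In the slow plan, the Seq Scan on tm_availseats3_exp was estimated at 7727373 rows while 3609106 were actually read, which is consistent with outdated statistics, though it does not prove it on its own.

```sql
-- When were the tables last analyzed, manually or by autovacuum?
-- A NULL in both last_analyze and last_autoanalyze means statistics
-- have never been collected for that table.
SELECT relname,
       last_analyze,
       last_autoanalyze,
       n_live_tup,
       n_mod_since_analyze   -- rows changed since the last ANALYZE (9.4+)
FROM pg_stat_user_tables
WHERE relname IN ('tm_availseats3_exp', 'tm_evnt3');

-- Row count the planner works from, to compare against a real count(*).
SELECT relname,
       reltuples::bigint AS planner_row_estimate
FROM pg_class
WHERE relname = 'tm_availseats3_exp';
```

Running the same two queries in both environments and comparing the output should show whether the statistics differ even though the data does not.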
Running VACUUM ANALYZE brought the slow query's performance very close to that of the fast one.
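A minimal sketch of that fix, assuming it is run in the slow environment by a role allowed to vacuum the table. The answer does not say whether VACUUM ANALYZE was run on the whole database or on one table; targeting tm_availseats3_exp here is an assumption based on the mis-estimated Seq Scan in the slow plan.

```sql
-- Refresh planner statistics (and reclaim dead rows) for the large table.
-- Plain "ANALYZE tm_availseats3_exp;" refreshes the statistics alone.
VACUUM ANALYZE tm_availseats3_exp;

-- Re-run the original query under EXPLAIN ANALYZE and check whether the
-- plan now uses the bitmap index scan on tm_availseats3_exp_on_event
-- instead of a Seq Scan.
EXPLAIN ANALYZE
select count(seat_id) as avail, ev.event_name, price_code,
(case when substring(section_name, 4, 1) = 'A' then substring(section_name, 1, 3)
when row_name < '9999' then section_name
else section_name || 'C'
end) as section_name_full, class_name
from tm_availseats3_exp seats join tm_event_map ev on ev.event_name = seats.event_name
where event_sub_type = 'General'
group by ev.event_name, price_code, section_name_full, class_name, row_name;
```

On the hints question: core PostgreSQL has no query hint syntax. Planner settings such as `enable_seqscan` can be switched off for a single session to see what the alternative plan would cost, but that is a diagnostic aid rather than a fix.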