I'm working on a query that performs very poorly:
SELECT COUNT(*)
FROM ps
INNER JOIN p ON p.id = ps.patient_id
INNER JOIN hh ON hh.id = ps.hh_id
INNER JOIN cma ON cma.id = ps.cma_id
INNER JOIN ter ters ON ( p.mm_id = ters.member_id )
AND ( hh.mmis_id = ters.hh_mmis_id )
AND ( cma.mmis_id = ters.cma_mmis_id )
AND ( ps.start_date = ters.begin_date )
AND ( CASE WHEN ps.oe_id = 1 THEN 'O' WHEN ps.oe_id = 2 THEN 'E' ELSE 'UNKNOWN_oe_id' END = ters.outreach_enrollment_code )
WHERE ters.status != 'Canceled' AND hh.id = 1;
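In the query plan I noticed that a Sort node (the one feeding the Merge Join) emits more rows than it receives as input. That really confuses my mental model. What am I missing?
Here is the relevant excerpt of the query plan: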
-> Sort (cost=20956.81..21259.78 rows=121187 width=20) (actual time=140.260..3363.612 rows=29930138 loops=1)
      Output: ps.p_id, ps.hh_id, ps.cma_id, ps.start_date, ps.oe_code_id, (CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_code_id'::text END)
      Sort Key: ps.start_date, ps.cma_id, (CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_code_id'::text END)
      Sort Method: quicksort Memory: 12708kB
      Buffers: shared hit=4983
      -> Bitmap Heap Scan on public.ps (cost=2275.62..10724.46 rows=121187 width=20) (actual time=8.833..58.231 rows=123338 loops=1)
            Output: ps.p_id, ps.hh_id, ps.cma_id, ps.start_date, ps.oe_code_id, CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_code_id'::text END
            Recheck Cond: (ps.hh_id = 1)
            Heap Blocks: exact=4644
            Buffers: shared hit=4983
            -> Bitmap Index Scan on index_ps_on_hh_id (cost=0.00..2245.33 rows=121187 width=0) (actual time=8.138..8.138 rows=123338 loops=1)
                  Index Cond: (ps.hh_id = 1)
                  Buffers: shared hit=339
Note that the Bitmap Heap Scan emits 123,338 rows, and then the Sort emits 29,930,138!
People asked for the full query plan:
Aggregate (cost=67207.10..67207.11 rows=1 width=0) (actual time=199297.658..199297.658 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=119969133 dirtied=1
  -> Nested Loop (cost=59884.61..67207.10 rows=1 width=0) (actual time=486.145..199261.336 rows=120386 loops=1)
        Join Filter: (ps.p_id = p.id)
        Rows Removed by Join Filter: 29809605
        Buffers: shared hit=119969133 dirtied=1
        -> Merge Join (cost=59884.19..62745.05 rows=8862 width=13) (actual time=486.052..19265.755 rows=29930082 loops=1)
              Output: ps.p_id, ters.member_id
              Merge Cond: ((ters.begin_date = ps.start_date) AND (cma.id = ps.cma_id) AND ((ters.oe_code)::text = (CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_CODE_ID'::text END)))
              Buffers: shared hit=11752
              -> Sort (cost=38920.83..39082.15 rows=64528 width=23) (actual time=323.201..384.837 rows=130638 loops=1)
                    Output: hh.id, ters.member_id, ters.begin_date, ters.oe_code, cma.id
                    Sort Key: ters.begin_date, cma.id, ters.oe_code
                    Sort Method: quicksort Memory: 13279kB
                    Buffers: shared hit=6769
                    -> Hash Join (cost=3194.35..33765.80 rows=64528 width=23) (actual time=18.149..194.187 rows=130638 loops=1)
                          Output: hh.id, ters.member_id, ters.begin_date, ters.oe_code, cma.id
                          Hash Cond: ((ters.cma_mmis_id)::text = (cma.mmis_id)::text)
                          Buffers: shared hit=6759
                          -> Nested Loop (cost=3190.12..32556.05 rows=64028 width=28) (actual time=18.075..150.186 rows=130108 loops=1)
                                Output: hh.id, ters.member_id, ters.cma_mmis_id, ters.begin_date, ters.oe_code
                                Buffers: shared hit=6754
                                -> Seq Scan on public.hh (cost=0.00..1.12 rows=1 width=10) (actual time=0.008..0.011 rows=1 loops=1)
                                      Output: hh.id, hh.name ... [redacted]
                                      Filter: (hh.id = 1)
                                      Rows Removed by Filter: 9
                                      Buffers: shared hit=1
                                -> Bitmap Heap Scan on public.ters ters (cost=3190.12..31678.69 rows=87623 width=33) (actual time=18.063..124.542 rows=130108 loops=1)
                                      Output: ters.member_id, ters.hh_mmis_id, ters.cma_mmis_id, ters.begin_date, ters.oe_code
                                      Recheck Cond: ((ters.hh_mmis_id)::text = (hh.mmis_id)::text)
                                      Filter: ((ters.status)::text <> 'Canceled'::text)
                                      Rows Removed by Filter: 49848
                                      Heap Blocks: exact=6060
                                      Buffers: shared hit=6753
                                      -> Bitmap Index Scan on ters_hh_mmis_id_idx (cost=0.00..3168.21 rows=138105 width=0) (actual time=16.965..16.965 rows=179956 loops=1)
                                            Index Cond: ((ters.hh_mmis_id)::text = (hh.mmis_id)::text)
                                            Buffers: shared hit=693
                          -> Hash (cost=2.99..2.99 rows=99 width=12) (actual time=0.052..0.052 rows=99 loops=1)
                                Output: cma.id, cma.mmis_id
                                Buckets: 1024 Batches: 1 Memory Usage: 5kB
                                Buffers: shared hit=2
                                -> Seq Scan on public.cma (cost=0.00..2.99 rows=99 width=12) (actual time=0.006..0.030 rows=99 loops=1)
                                      Output: cma.id, cma.mmis_id
                                      Buffers: shared hit=2
              -> Sort (cost=20956.81..21259.78 rows=121187 width=20) (actual time=162.834..3317.995 rows=29930138 loops=1)
                    Output: ps.p_id, ps.hh_id, ps.cma_id, ps.start_date, ps.oe_code_id, (CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_CODE_ID'::text END)
                    Sort Key: ps.start_date, ps.cma_id, (CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_CODE_ID'::text END)
                    Sort Method: quicksort Memory: 12708kB
                    Buffers: shared hit=4983
                    -> Bitmap Heap Scan on public.ps (cost=2275.62..10724.46 rows=121187 width=20) (actual time=9.940..72.463 rows=123338 loops=1)
                          Output: ps.p_id, ps.hh_id, ps.cma_id, ps.start_date, ps.oe_code_id, CASE WHEN (ps.oe_code_id = 1) THEN 'O'::text WHEN (ps.oe_code_id = 2) THEN 'E'::text ELSE 'UNKNOWN_oe_CODE_ID'::text END
                          Recheck Cond: (ps.hh_id = 1)
                          Heap Blocks: exact=4644
                          Buffers: shared hit=4983
                          -> Bitmap Index Scan on index_ps_on_hh_id (cost=0.00..2245.33 rows=121187 width=0) (actual time=9.226..9.226 rows=123338 loops=1)
                                Index Cond: (ps.hh_id = 1)
                                Buffers: shared hit=339
        -> Index Scan using index_p_on_mm_id on public.p (cost=0.42..0.49 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=29930082)
              Output: p.id, p.mm_id
              Index Cond: ((p.mm_id)::text = (ters.member_id)::text)
              Buffers: shared hit=119957381 dirtied=1
Planning time: 5.952 ms
Execution time: 199299.305 ms
Answer 0 (score: 0)
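On the row counts themselves, a likely explanation (this is general PostgreSQL merge-join behaviour, not something the plan states outright): when consecutive rows on the outer side share the same join key, the Merge Join restores its inner input to a marked position and reads the matching inner rows again, and EXPLAIN ANALYZE counts each of those re-reads in the inner node's rows figure. With 130,638 outer rows and heavily duplicated sort keys, the 123,338 sorted rows from ps can be handed out roughly 29.9 million times. The sketch below tries to reproduce the effect on two hypothetical temp tables (demo_outer and demo_inner are made up for illustration and are not part of the question's schema). The enable_* settings are only there to steer the planner toward a merge join over a bare Sort, and the exact plan shape may still differ between versions.

```sql
-- Hypothetical demo tables: 1,000 rows each, but only 10 distinct key values,
-- so every key is duplicated 100 times on both sides of the join.
CREATE TEMP TABLE demo_outer AS
SELECT g % 10 AS k FROM generate_series(1, 1000) AS g;

CREATE TEMP TABLE demo_inner AS
SELECT g % 10 AS k FROM generate_series(1, 1000) AS g;

ANALYZE demo_outer;
ANALYZE demo_inner;

-- Session-local planner settings, used here only to encourage a Merge Join
-- whose inner input is a plain Sort (no Materialize node in between).
SET enable_hashjoin = off;
SET enable_nestloop = off;
SET enable_material = off;

-- Compare the "rows" reported by the inner Sort with the 1,000 rows
-- reported by the scan underneath it.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*)
FROM demo_outer o
JOIN demo_inner i ON i.k = o.k;

-- Put the planner settings back to their defaults.
RESET enable_hashjoin;
RESET enable_nestloop;
RESET enable_material;
```

If the plan comes out as a Merge Join, the inner Sort should report far more rows than the 1,000 it read from its child, for the same reason as the Sort on ps in the question.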
Try refactoring the query without the CASE expression in the ON clause:
SELECT COUNT(*)
FROM ps
INNER JOIN p ON p.id = ps.patient_id
INNER JOIN hh ON hh.id = ps.hh_id
INNER JOIN cma ON cma.id = ps.cma_id
INNER JOIN ter ters ON ( p.mm_id = ters.member_id )
AND ( hh.mmis_id = ters.hh_mmis_id )
AND ( cma.mmis_id = ters.cma_mmis_id )
AND ( ps.start_date = ters.begin_date )
AND ( (ps.oe_id = 1 AND ters.outreach_enrollment_code = 'O')
OR (ps.oe_id = 2 AND ters.outreach_enrollment_code = 'E')
OR ((ps.oe_id IS NULL OR ps.oe_id NOT IN (1,2)) AND ters.outreach_enrollment_code = 'UNKNOWN_oe_id'))
WHERE ters.status != 'Canceled' AND hh.id = 1;
It will also help performance if you make sure the statistics on these tables are up to date and that there is an index on ps.oe_id.
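As a rough sketch of that last suggestion, assuming the table names used in the query above (ter is the table aliased as ters) and a hypothetical index name, index_ps_on_oe_id:

```sql
-- Refresh planner statistics for every table that takes part in the join.
ANALYZE ps;
ANALYZE p;
ANALYZE hh;
ANALYZE cma;
ANALYZE ter;

-- Index name is made up for illustration; the column is the one named above.
CREATE INDEX index_ps_on_oe_id ON ps (oe_id);
```

Afterwards, re-running the query under EXPLAIN (ANALYZE, VERBOSE, BUFFERS) and comparing the estimated and actual row counts on the Merge Join (8862 estimated against 29930082 actual in the plan above) is one way to check whether the fresher statistics changed anything. Whether the planner actually uses the new index depends on how selective oe_id is.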