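When I run a query with a lateral join that has a LIMIT inside the subquery, PostgreSQL uses a nested loop join. When I remove the LIMIT, it uses a hash right join instead. Why?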
EXPLAIN ANALYSE
SELECT proxy.*
FROM jobs
LEFT OUTER JOIN LATERAL (
SELECT proxy.*
FROM proxy
WHERE jobs.id = proxy.job_id
) proxy ON true
Hash Right Join  (cost=2075.47..3029.05 rows=34688 width=12) (actual time=9.951..24.758 rows=35212 loops=1)
  Hash Cond: (proxy.job_id = jobs.id)
  ->  Seq Scan on proxy  (cost=0.00..524.15 rows=34015 width=12) (actual time=0.011..2.502 rows=34028 loops=1)
  ->  Hash  (cost=1641.87..1641.87 rows=34688 width=4) (actual time=9.842..9.842 rows=34689 loops=1)
        Buckets: 65536  Batches: 1  Memory Usage: 1732kB
        ->  Index Only Scan using jobs_pkey on jobs  (cost=0.29..1641.87 rows=34688 width=4) (actual time=0.010..4.904 rows=34689 loops=1)
              Heap Fetches: 921
But when I add a LIMIT to the query, the actual time jumps from about 24 ms to about 150 ms:
EXPLAIN ANALYSE
SELECT proxy.*
FROM jobs
LEFT OUTER JOIN LATERAL (
SELECT proxy.*
FROM proxy
WHERE jobs.id = proxy.job_id
limit 1
) proxy ON true
Nested Loop Left Join  (cost=0.58..290506.19 rows=34688 width=12) (actual time=0.024..155.753 rows=34689 loops=1)
  ->  Index Only Scan using jobs_pkey on jobs  (cost=0.29..1641.87 rows=34688 width=4) (actual time=0.014..3.984 rows=34689 loops=1)
        Heap Fetches: 921
  ->  Limit  (cost=0.29..8.31 rows=1 width=12) (actual time=0.001..0.001 rows=1 loops=34689)
        ->  Index Scan using index_job_proxy_on_job_id on loc_job_source_materials  (cost=0.29..8.31 rows=1 width=12) (actual time=0.001..0.001 rows=1 loops=34689)
              Index Cond: (jobs.id = job_id)
Answer 0 (score: 2)
The optimizer is smart enough to rewrite your first query as
SELECT proxy.*
FROM proxy
RIGHT OUTER JOIN jobs
ON jobs.id = proxy.job_id;
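With the LIMIT clause, however, this rewrite is not possible: the subquery has to be evaluated once for each row of jobs, so a nested loop join is the only option.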
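One way to check this on your own data is to discourage nested loops for a single transaction and look at the plan again. This is only a diagnostic sketch that reuses the table names from the question; enable_nestloop is a standard PostgreSQL planner setting, and turning it off makes nested loops very expensive in the planner's estimate rather than forbidding them.

```sql
BEGIN;

-- Discourage nested loop joins for this transaction only.
SET LOCAL enable_nestloop = off;

-- If the plan still shows a Nested Loop here, the planner had no
-- other join method available for the LATERAL subquery with LIMIT.
EXPLAIN
SELECT proxy.*
FROM jobs
LEFT OUTER JOIN LATERAL (
    SELECT proxy.*
    FROM proxy
    WHERE jobs.id = proxy.job_id
    LIMIT 1
) proxy ON true;

-- Undo the setting change.
ROLLBACK;
```

Running the same EXPLAIN on the query without LIMIT should let you compare which join methods the planner considers in each case.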
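Answer 1 (score: 0)
Following on from @LaurenzAlbe's answer: I think we could help more if you showed the full query, so that we know why you need the LATERAL join. For the (simplified) requirement you have described so far, I think the following is equivalent: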
SELECT DISTINCT ON(proxy.id) proxy.*
FROM proxy
RIGHT OUTER JOIN jobs
ON jobs.id = proxy.job_id;
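Also, since you only output columns from proxy, you are in effect only doing an INNER JOIN (apart from the all-NULL rows returned for jobs that have no proxy), but with more computation.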
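If the intent of the LIMIT 1 is "at most one proxy row per job" (an assumption, since the full query is not shown), a variant that keeps one row per jobs.id rather than per proxy.id may be closer to the original LATERAL query. The sketch below uses only the tables and columns from the question; whether it is actually faster than the nested loop would need to be measured with EXPLAIN ANALYSE.

```sql
-- One output row per job. Jobs without a proxy yield a row of NULLs,
-- as in the LEFT JOIN LATERAL version.
-- Without a further ORDER BY column, which proxy row is kept for each
-- job is arbitrary, just like LIMIT 1 without ORDER BY.
SELECT DISTINCT ON (jobs.id) proxy.*
FROM jobs
LEFT OUTER JOIN proxy
    ON proxy.job_id = jobs.id
ORDER BY jobs.id;
```

Because the join here is a plain outer join without LIMIT, the planner is free to choose a hash or merge join and then remove the duplicates per job.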