Huge performance difference in PostgreSQL between IN and NOT IN

Time: 2018-10-04 03:30:28

Tags: postgresql sql-execution-plan explain

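I have 2 tables, "transaksi" and "buku". "transaksi" has about 250 thousand rows and "buku" has about 170 thousand rows. Both tables have a column named "k999a", and neither table has any indexes. I compared the plans for the following two statements.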

Statement 1:

explain select k999a from transaksi where k999a not in (select k999a from buku);

Output of statement 1:

 Seq Scan on transaksi  (cost=0.00..721109017.46 rows=125426 width=9)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Materialize  (cost=0.00..5321.60 rows=171040 width=8)
           ->  Seq Scan on buku  (cost=0.00..3797.40 rows=171040 width=8)

Statement 2:

explain select k999a from transaksi where k999a in (select k999a from buku);

Output of statement 2:

Hash Semi Join  (cost=6604.40..22664.82 rows=250853 width=9)
   Hash Cond: (transaksi.k999a = buku.k999a)
   ->  Seq Scan on transaksi  (cost=0.00..6356.53 rows=250853 width=9)
   ->  Hash  (cost=3797.40..3797.40 rows=171040 width=8)
         ->  Seq Scan on buku  (cost=0.00..3797.40 rows=171040 width=8)

Why does PostgreSQL do a loop join for the NOT IN query, making the query take so long?

PS: PostgreSQL version 9.6.1 on Windows 10

1 answer:

Answer 0: (score: 5)

This is to be expected. You will probably get better performance using NOT EXISTS instead:

WHERE NOT EXISTS
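
Applied to the tables in the question, the rewrite might look like the sketch below. The answer only names the clause, so the full query (including the aliases t and b) is an assumption. It returns the same rows as statement 1 only if k999a is never NULL in either table, because NOT IN and NOT EXISTS treat NULLs differently.

```sql
-- Sketch: statement 1 rewritten with NOT EXISTS.
-- Assumed form, not necessarily the answerer's exact query.
EXPLAIN
SELECT t.k999a
FROM transaksi AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM buku AS b
    WHERE b.k999a = t.k999a   -- correlate on the shared column
);
```

Written this way, the planner is free to choose an anti join (for example a Hash Anti Join) instead of re-scanning the materialized buku rows for every row of transaksi.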

For a good explanation of why each approach behaves the way it does, see: https://explainextended.com/2009/09/16/not-in-vs-not-exists-vs-left-join-is-null-postgresql/
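
A large part of the reason is NULL handling: if the subquery can return a NULL, NOT IN must yield NULL rather than true, so PostgreSQL cannot simply turn it into an anti join. The sketch below illustrates the difference with inline VALUES lists; the aliases t, b and the column name x are made up for the illustration and are not from the question.

```sql
-- NOT IN: for x = 2, "2 NOT IN (1, NULL)" evaluates to NULL, not true,
-- so this query returns no rows at all.
SELECT t.x
FROM (VALUES (1), (2)) AS t(x)
WHERE t.x NOT IN (SELECT b.x FROM (VALUES (1), (NULL)) AS b(x));

-- NOT EXISTS: the NULL row never satisfies b.x = t.x,
-- so this query returns the single row x = 2.
SELECT t.x
FROM (VALUES (1), (2)) AS t(x)
WHERE NOT EXISTS (
    SELECT 1
    FROM (VALUES (1), (NULL)) AS b(x)
    WHERE b.x = t.x
);
```

Because the two forms can give different results when NULLs are present, it is worth checking whether k999a is nullable before swapping one for the other.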