Question

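There are two tables in myschema: post and comment. The comment's onId refers to the post's id. Which of the following two pieces of Postgres code is faster?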

DELETE FROM myschema.comment comment
  WHERE NOT EXISTS (
    SELECT NULL
    FROM myschema.post post
    WHERE post.id = comment."onId"
  );


DELETE FROM myschema.comment
USING myschema.comment AS newcomment LEFT JOIN myschema.post AS newpost
ON newpost.id = newcomment."onId"
WHERE myschema.comment.id = newcomment.id AND newpost.id is NULL;

Thanks

Update

Does it make any difference if id and onId are indexed?

Answer 1

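Option 1 is faster.

In option 2 the database has to perform the whole join, whereas option 1 can be executed as an anti-join (a negated semi-join).

I believe this holds for all major RDBMSs nowadays, not just PostgreSQL.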
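One way to check this claim against your own data is to compare the plans the planner picks for the two statements. The following is only a sketch, assuming the tables and columns from the question exist as written. Plain EXPLAIN (without ANALYZE) plans the statement without executing it, so nothing is deleted.

```sql
-- Option 1: NOT EXISTS. Look for an "Anti Join" node in the output.
EXPLAIN
DELETE FROM myschema.comment comment
WHERE NOT EXISTS (
  SELECT NULL
  FROM myschema.post post
  WHERE post.id = comment."onId"
);

-- Option 2: self-join through USING plus LEFT JOIN ... IS NULL.
-- Compare the node types and the estimated cost with option 1.
EXPLAIN
DELETE FROM myschema.comment
USING myschema.comment AS newcomment
LEFT JOIN myschema.post AS newpost
  ON newpost.id = newcomment."onId"
WHERE myschema.comment.id = newcomment.id
  AND newpost.id IS NULL;
```

The plan shape depends on the PostgreSQL version, the table sizes and the statistics, so the output on a real schema may differ from what any single test shows.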

Answer 2

t=# drop table a,b;
DROP TABLE
t=# create table a(i int);
CREATE TABLE
t=# create table b(i int);
CREATE TABLE
t=# insert into a select * from generate_series(1,10000,1);
INSERT 0 10000
t=# insert into b select * from generate_series(1000,11000,1);
INSERT 0 10001

And the plans:

t=# begin; explain analyze delete from b where not exists (select null from a where a.i = b.i); rollback;
BEGIN
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Delete on b  (cost=270.00..565.02 rows=1 width=12) (actual time=10.516..10.516 rows=0 loops=1)
   ->  Hash Anti Join  (cost=270.00..565.02 rows=1 width=12) (actual time=9.154..9.927 rows=1000 loops=1)
         Hash Cond: (b.i = a.i)
         ->  Seq Scan on b  (cost=0.00..145.01 rows=10001 width=10) (actual time=0.008..2.014 rows=10001 loops=1)
         ->  Hash  (cost=145.00..145.00 rows=10000 width=10) (actual time=4.554..4.554 rows=10000 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 430kB
               ->  Seq Scan on a  (cost=0.00..145.00 rows=10000 width=10) (actual time=0.005..2.107 rows=10000 loops=1)
 Total runtime: 10.589 ms
(8 rows)

ROLLBACK
t=# begin; explain analyze delete from b using b bj left outer join a on bj.i = a.i where a.i is null and b.i = bj.i; rollback;
BEGIN
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Delete on b  (cost=565.04..747.56 rows=1 width=18) (actual time=14.684..14.684 rows=0 loops=1)
   ->  Hash Join  (cost=565.04..747.56 rows=1 width=18) (actual time=13.482..14.119 rows=1000 loops=1)
         Hash Cond: (b.i = bj.i)
         ->  Seq Scan on b  (cost=0.00..145.01 rows=10001 width=10) (actual time=0.010..1.928 rows=10001 loops=1)
         ->  Hash  (cost=565.02..565.02 rows=1 width=16) (actual time=10.224..10.224 rows=1000 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 43kB
               ->  Hash Anti Join  (cost=270.00..565.02 rows=1 width=16) (actual time=9.178..9.969 rows=1000 loops=1)
                     Hash Cond: (bj.i = a.i)
                     ->  Seq Scan on b bj  (cost=0.00..145.01 rows=10001 width=10) (actual time=0.003..2.098 rows=10001 loops=1)
                     ->  Hash  (cost=145.00..145.00 rows=10000 width=10) (actual time=4.570..4.570 rows=10000 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 430kB
                           ->  Seq Scan on a  (cost=0.00..145.00 rows=10000 width=10) (actual time=0.005..2.088 rows=10000 loops=1)
 Total runtime: 14.775 ms
(13 rows)

ROLLBACK

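The first one looks both faster and cheaper: a total runtime of 10.589 ms against 14.775 ms, and an estimated cost of 565.02 against 747.56. The second plan contains the same Hash Anti Join, plus an extra Hash Join back onto b.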
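The test above was run without indexes, so it does not address the update in the question. A minimal sketch for repeating it with indexes on the same test tables a and b could look like this. The index names a_i_idx and b_i_idx are made up for illustration, and no result is claimed here: the planner may or may not use the indexes on tables this small.

```sql
-- Hypothetical index names; the columns are the ones from the test tables.
CREATE INDEX a_i_idx ON a (i);
CREATE INDEX b_i_idx ON b (i);

-- Refresh planner statistics so the estimates reflect the current data.
ANALYZE a;
ANALYZE b;

-- EXPLAIN ANALYZE really executes the DELETE, so wrap each run in a
-- transaction and roll it back, as in the session above.
BEGIN;
EXPLAIN ANALYZE
DELETE FROM b
WHERE NOT EXISTS (SELECT NULL FROM a WHERE a.i = b.i);
ROLLBACK;

BEGIN;
EXPLAIN ANALYZE
DELETE FROM b
USING b bj
LEFT OUTER JOIN a ON bj.i = a.i
WHERE a.i IS NULL
  AND b.i = bj.i;
ROLLBACK;
```

Comparing the node types and the total runtime of the two plans, before and after creating the indexes, shows whether indexing changes the picture on a given setup.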

postgres: which is the faster way to delete data that does not exist in another table
