Question

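I currently have two tables with the same structure. Table A (as an example) has 10,000 rows and table B has 100,000 rows. I need to get the rows in table B that are not in table A, but only where certain fields are the same (and one is not).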

Right now, the query looks something like this:

select *
from tableA A
where (A.field1, A.field2) in (select field1, field2 from tableB B)
  and A.field3 not in (select field3 from tableB)

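This works fine, but there may be a better-performing solution using a JOIN. I tried that, but what I got was a very large list of duplicated rows. Can anyone point me in the right direction?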

Answer 1

Based on your current query, this is what it translates to as a join:

select * 
from tableA A
inner join tableB B on A.field1 = B.field1 and A.field2 = B.field2
left outer join tableB C on A.field3 = C.field3
where c.field3 is null

A faster query would be:

    select A.pk
    from tableA A
    inner join tableB B on A.field1 = B.field1 and A.field2 = B.field2
    left outer join tableB C on A.field3 = C.field3
    where c.field3 is null
    group by A.pk

This will give you the rows that need to be added to tableB because they were not found there.

Or you can select just the fields you want to pull over:

    select A.field1, A.field2, A.field3
    from tableA A
    inner join tableB B on A.field1 = B.field1 and A.field2 = B.field2
    left outer join tableB C on A.field3 = C.field3
    where c.field3 is null
    group by A.field1, A.field2, A.field3
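
To see why the GROUP BY matters, here is a minimal sketch that runs the join from this answer against a tiny in-memory data set. It uses Python's sqlite3 module as a stand-in for PostgreSQL, and the sample rows and the `pk` column values are made up for illustration. Two tableB rows match the same tableA row on (field1, field2), so the plain join repeats that row, which is likely the source of the duplicates described in the question.

```python
import sqlite3

# SQLite is only a stand-in here; the thread itself is about PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (pk INTEGER PRIMARY KEY, field1 INT, field2 INT, field3 INT);
    CREATE TABLE tableB (pk INTEGER PRIMARY KEY, field1 INT, field2 INT, field3 INT);
    INSERT INTO tableA VALUES (1, 1, 1, 10), (2, 2, 2, 20), (3, 3, 3, 30);
    -- two tableB rows match tableA pk = 1 on (field1, field2)
    INSERT INTO tableB VALUES (1, 1, 1, 99), (2, 1, 1, 98), (3, 2, 2, 20);
""")

# The join and filter shared by the queries in this answer.
JOINS = """
    FROM tableA A
    INNER JOIN tableB B ON A.field1 = B.field1 AND A.field2 = B.field2
    LEFT OUTER JOIN tableB C ON A.field3 = C.field3
    WHERE C.field3 IS NULL
"""

# Without GROUP BY: one output row per matching tableB row.
plain = conn.execute("SELECT A.pk" + JOINS + "ORDER BY A.pk").fetchall()

# With GROUP BY: each qualifying tableA row appears once.
grouped = conn.execute(
    "SELECT A.pk" + JOINS + "GROUP BY A.pk ORDER BY A.pk"
).fetchall()

print(plain)    # [(1,), (1,)]
print(grouped)  # [(1,)]
```

Row 2 is dropped because its field3 value exists in tableB, and row 3 is dropped because nothing in tableB matches its (field1, field2).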

Answer 2

[NOT] EXISTS is your friend:

SELECT *
FROM tableA A
WHERE EXISTS ( SELECT * FROM tableB B
    WHERE A.field1 = B.field1
    AND A.field2 = B.field2
    )
AND NOT EXISTS ( SELECT * FROM tableB B 
    WHERE  A.field3 = B.field3 
    );

Note: if the joined columns are not nullable, the [NOT] EXISTS() version behaves exactly the same as the [NOT] IN version.
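
The nullability caveat in the note above can be checked with a small sketch. It again uses Python's sqlite3 as a stand-in for PostgreSQL (both follow the standard three-valued logic for this case), and the single-column tables and their values are invented for the example. Once the subquery returns a NULL, `NOT IN` can never evaluate to true, while `NOT EXISTS` is unaffected.

```python
import sqlite3

# Stand-in engine and made-up data; only field3 matters for this comparison.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (field3 INT);
    CREATE TABLE tableB (field3 INT);
    INSERT INTO tableA VALUES (10), (20);
    INSERT INTO tableB VALUES (20), (NULL);
""")

# 10 NOT IN (20, NULL) evaluates to NULL, so the row is filtered out.
not_in = conn.execute("""
    SELECT field3 FROM tableA A
    WHERE A.field3 NOT IN (SELECT field3 FROM tableB)
""").fetchall()

# The correlated subquery finds no row with field3 = 10, so NOT EXISTS is true.
not_exists = conn.execute("""
    SELECT field3 FROM tableA A
    WHERE NOT EXISTS (SELECT 1 FROM tableB B WHERE A.field3 = B.field3)
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(10,)]
```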

Reading the question text again (and again):

I need to get the rows in table B that are not in table A, but only where certain fields are the same (and one is not).

SELECT *
FROM tableB B
WHERE EXISTS ( SELECT * FROM tableA A
        WHERE A.field1 = B.field1
        AND A.field2 = B.field2 
        AND  A.field3 <> B.field3
        );
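
As a quick sanity check of this second reading, the sketch below runs the query on a handful of invented rows, using Python's sqlite3 as a stand-in for PostgreSQL. Only the tableB row that shares (field1, field2) with a tableA row but differs in field3 should come back.

```python
import sqlite3

# Stand-in engine and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (field1 INT, field2 INT, field3 INT);
    CREATE TABLE tableB (field1 INT, field2 INT, field3 INT);
    INSERT INTO tableA VALUES (1, 1, 10), (2, 2, 20);
    INSERT INTO tableB VALUES (1, 1, 10), (1, 1, 11), (2, 2, 20), (3, 3, 30);
""")

rows = conn.execute("""
    SELECT * FROM tableB B
    WHERE EXISTS ( SELECT * FROM tableA A
        WHERE A.field1 = B.field1
        AND A.field2 = B.field2
        AND A.field3 <> B.field3
        )
    ORDER BY B.field3
""").fetchall()

print(rows)  # [(1, 1, 11)]
```

One thing to keep in mind: a tableB row is returned whenever some tableA row shares its field1 and field2 with a different field3, even if another tableA row matches it exactly on all three fields.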

Converting a PostgreSQL query with IN / NOT IN to a JOIN
