Find rows that are related to at least n rows in another table, without a join

Date: 2019-06-04 09:48:27

Tags: postgresql self-join

I have a table like this (tbl):

+----+------+-----+
| pk | attr | val |
+----+------+-----+
|  0 | ohif |   4 |
|  1 | foha |  56 |
|  2 | slns |   2 |
|  3 | faso |  11 |
+----+------+-----+

And another table (tbl2) that has an n-to-1 relationship with tbl:

+----+-----+
| pk | rel |
+----+-----+
|  0 |   0 |
|  1 |   1 |
|  2 |   0 |
|  3 |   2 |
|  4 |   2 |
|  5 |   3 |
|  6 |   1 |
|  7 |   2 |
+----+-----+

(tbl2.rel -> tbl.pk.)
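To make the queries in this thread easy to try, the sample data above could be set up roughly as follows. This is only a sketch: the column types, the primary keys, and the foreign key on tbl2.rel are assumptions, since the question shows the data but not the table definitions.

```sql
-- Assumed schema: the types and constraints are guesses, not taken from the question.
CREATE TABLE tbl (
    pk   integer PRIMARY KEY,
    attr text,
    val  integer
);

CREATE TABLE tbl2 (
    pk  integer PRIMARY KEY,
    rel integer REFERENCES tbl (pk)  -- tbl2.rel -> tbl.pk
);

-- Rows copied from the two tables shown above.
INSERT INTO tbl (pk, attr, val) VALUES
    (0, 'ohif', 4),
    (1, 'foha', 56),
    (2, 'slns', 2),
    (3, 'faso', 11);

INSERT INTO tbl2 (pk, rel) VALUES
    (0, 0), (1, 1), (2, 0), (3, 2),
    (4, 2), (5, 3), (6, 1), (7, 2);
```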

I want to select only the rows of tbl that are related to at least n rows of tbl2.

That is, for n = 2, I want this table:

+----+------+-----+
| pk | attr | val |
+----+------+-----+
|  0 | ohif |   4 |
|  1 | foha |  56 |
|  2 | slns |   2 |
+----+------+-----+

This is the solution I came up with:

SELECT DISTINCT ON (tbl.pk) tbl.*
FROM (
    SELECT tbl.pk
    FROM tbl
    RIGHT OUTER JOIN tbl2 ON tbl2.rel = tbl.pk
    GROUP BY tbl.pk
    HAVING COUNT(tbl2.*) >= 2  -- n
) AS tbl_candidates
LEFT OUTER JOIN tbl ON tbl_candidates.pk = tbl.pk

Is there a way to do this without selecting the candidates in a subquery and then joining them back to the table itself?

I am using Postgres 10. A standard SQL solution would be better, but a Postgres-specific solution is acceptable.

1 Answer:

Answer 0: (score: 0)

Sure, you only need to join once, like this:

select
    t1.pk,
    t1.attr,
    t1.val
from
    tbl t1
join
    tbl2 t2 on t1.pk = t2.rel
group by
    t1.pk,
    t1.attr,
    t1.val
having
    count(*) >= 2  -- n
order by
    t1.pk;
 pk | attr | val 
----+------+-----
  0 | ohif |   4
  1 | foha |  56
  2 | slns |   2
(3 rows)

Or join once and use a CTE (WITH clause), like this:

with tmp as (
    select rel from tbl2 group by rel having count(*) >= 2  -- n
)
select b.*
from tmp t
join tbl b on t.rel = b.pk
order by b.pk;
 pk | attr | val 
----+------+-----
  0 | ohif |   4
  1 | foha |  56
  2 | slns |   2
(3 rows)

Is the SQL clearer this way?
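For comparison, since the question title asks for a version without a join, the same filter can also be written with a correlated scalar subquery. This is a minimal sketch and not part of the original answer. It uses only standard SQL and assumes the tbl and tbl2 tables exactly as shown in the question.

```sql
-- Keep each tbl row for which at least n tbl2 rows point at it.
SELECT t.*
FROM tbl AS t
WHERE (
    SELECT COUNT(*)       -- number of tbl2 rows referencing this tbl row
    FROM tbl2 AS t2
    WHERE t2.rel = t.pk
) >= 2  -- n
ORDER BY t.pk;
```

On the sample data this should return the rows with pk 0, 1 and 2, the same as the two queries above. Whether it performs better or worse than the join depends on the planner and on whether tbl2.rel is indexed, so it is worth checking with EXPLAIN on real data.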