postgresql: self join by array

Time: 2018-12-10 12:26:16

Tags: postgresql self-join array-agg

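My question is about forming a Postgres SQL query for the following use case.

Approach #1

I have a table like the one below, in which the same uuid is generated for rows of different types (a, b, c, d), so that the uuid maps the different types to each other.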

+----+------+-------------+
| id | type | master_guid |
+----+------+-------------+
|  1 | a    | uuid-1      |
|  2 | a    | uuid-2      |
|  3 | a    | uuid-3      |
|  4 | a    | uuid-4      |
|  5 | a    | uuid-5      |
|  6 | b    | uuid-1      |
|  7 | b    | uuid-2      |
|  8 | b    | uuid-3      |
|  9 | b    | uuid-6      |
| 10 | c    | uuid-1      |
| 11 | c    | uuid-2      |
| 12 | c    | uuid-3      |
| 13 | c    | uuid-6      |
| 14 | c    | uuid-7      |
| 15 | d    | uuid-6      |
| 16 | d    | uuid-2      |
+----+------+-------------+

Approach #2

I created two tables, one mapping id to type and one mapping id to master_guid, as shown below.

Table 1:

+----+------+
| id | type |
+----+------+
|  1 | a    |
|  2 | a    |
|  3 | a    |
|  4 | a    |
|  5 | a    |
|  6 | b    |
|  7 | b    |
|  8 | b    |
|  9 | b    |
| 10 | c    |
| 11 | c    |
| 12 | c    |
| 13 | c    |
| 14 | c    |
| 15 | d    |
| 16 | d    |
+----+------+

Table 2:

+----+-------------+
| id | master_guid |
+----+-------------+
|  1 | uuid-1      |
|  2 | uuid-2      |
|  3 | uuid-3      |
|  4 | uuid-4      |
|  5 | uuid-5      |
|  6 | uuid-1      |
|  7 | uuid-2      |
|  8 | uuid-3      |
|  9 | uuid-6      |
| 10 | uuid-1      |
| 11 | uuid-2      |
| 12 | uuid-3      |
| 13 | uuid-6      |
| 14 | uuid-7      |
| 15 | uuid-6      |
| 16 | uuid-2      |
+----+-------------+

With both approaches, I want to get output like the following:

+----+------+--------+------------+
| id | type |  uuid  | mapped_ids |
+----+------+--------+------------+
|  1 | a    | uuid-1 | [6,10]     |
|  2 | a    | uuid-2 | [7,11]     |
|  3 | a    | uuid-3 | [8,12]     |
|  4 | a    | uuid-4 | null       |
|  5 | a    | uuid-5 | null       |
+----+------+--------+------------+

I have tried a self-join with array_agg on id, grouped by uuid, but could not get the desired output.

Use the following queries to populate the data:

Approach #1

insert into table1 values 
(1,'a','uuid-1'),
(2,'a','uuid-2'),
(3,'a','uuid-3'),
(4,'a','uuid-4'),
(5,'a','uuid-5'),
(6,'b','uuid-1'),
(7,'b','uuid-2'),
(8,'b','uuid-3'),
(9,'b','uuid-6'),
(10,'c','uuid-1'),
(11,'c','uuid-2'),
(12,'c','uuid-3'),
(13,'c','uuid-6'),
(14,'c','uuid-7'),
(15,'d','uuid-6'),
(16,'d','uuid-2');

Approach #2

insert into table1 values 
(1,'a'),
(2,'a'),
(3,'a'),
(4,'a'),
(5,'a'),
(6,'b'),
(7,'b'),
(8,'b'),
(9,'b'),
(10,'c'),
(11,'c'),
(12,'c'),
(13,'c'),
(14,'c'),
(15,'d'),
(16,'d');

insert into table2 values 
(1,'uuid-1'),
(2,'uuid-2'),
(3,'uuid-3'),
(4,'uuid-4'),
(5,'uuid-5'),
(6,'uuid-1'),
(7,'uuid-2'),
(8,'uuid-3'),
(9,'uuid-6'),
(10,'uuid-1'),
(11,'uuid-2'),
(12,'uuid-3'),
(13,'uuid-6'),
(14,'uuid-7'),
(15,'uuid-6'),
(16,'uuid-2');
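The INSERT statements above assume the tables already exist, and both approaches reuse the name table1. A minimal sketch of matching definitions is given below. The column types, the constraints, and the schema names approach1 and approach2 are assumptions made for illustration; the question does not show them.

```sql
-- Hypothetical schemas so that both variants of table1 can coexist.
CREATE SCHEMA approach1;
CREATE SCHEMA approach2;

-- Approach #1: one table holding id, type and master_guid.
-- master_guid is text here because sample values such as 'uuid-1'
-- are not valid literals for the uuid type.
CREATE TABLE approach1.table1 (
    id          integer PRIMARY KEY,
    type        text NOT NULL,
    master_guid text NOT NULL
);

-- Approach #2: id -> type and id -> master_guid in separate tables.
CREATE TABLE approach2.table1 (
    id   integer PRIMARY KEY,
    type text NOT NULL
);

CREATE TABLE approach2.table2 (
    id          integer PRIMARY KEY REFERENCES approach2.table1 (id),
    master_guid text NOT NULL
);

-- Choose one before running the matching INSERT statements, e.g.:
-- SET search_path TO approach1;
```

With search_path set to one of the two schemas, the unqualified table names in the INSERT statements and in the queries below resolve to that approach's tables.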

2 answers:

Answer 0: (score: 1)

demo: db<>fiddle

Using the window function ARRAY_AGG lets you aggregate the ids per group (in your case, the groups are your uuids):

SELECT 
    id, type, master_guid as uuid, 
    array_agg(id) OVER (PARTITION BY master_guid) as mapped_ids
FROM table1
ORDER BY id

Result:

| id | type |   uuid | mapped_ids |
|----|------|--------|------------|
|  1 |    a | uuid-1 |     10,6,1 |
|  2 |    a | uuid-2 |  16,2,7,11 |
|  3 |    a | uuid-3 |     8,3,12 |
|  4 |    a | uuid-4 |          4 |
|  5 |    a | uuid-5 |          5 |
|  6 |    b | uuid-1 |     10,6,1 |
|  7 |    b | uuid-2 |  16,2,7,11 |
|  8 |    b | uuid-3 |     8,3,12 |
|  9 |    b | uuid-6 |    15,13,9 |
| 10 |    c | uuid-1 |     10,6,1 |
| 11 |    c | uuid-2 |  16,2,7,11 |
| 12 |    c | uuid-3 |     8,3,12 |
| 13 |    c | uuid-6 |    15,13,9 |
| 14 |    c | uuid-7 |         14 |
| 15 |    d | uuid-6 |    15,13,9 |
| 16 |    d | uuid-2 |  16,2,7,11 |

These arrays currently also contain the id of the current row (the mapped_ids of id = 1 contains 1). This can be corrected by removing that element with array_remove:

SELECT 
    id, type, master_guid as uuid,  
    array_remove(array_agg(id) OVER (PARTITION BY master_guid), id) as mapped_ids
FROM table1
ORDER BY id

Result:

| id | type |   uuid | mapped_ids |
|----|------|--------|------------|
|  1 |    a | uuid-1 |       10,6 |
|  2 |    a | uuid-2 |    16,7,11 |
|  3 |    a | uuid-3 |       8,12 |
|  4 |    a | uuid-4 |            |
|  5 |    a | uuid-5 |            |
|  6 |    b | uuid-1 |       10,1 |
|  7 |    b | uuid-2 |    16,2,11 |
|  8 |    b | uuid-3 |       3,12 |
|  9 |    b | uuid-6 |      15,13 |
| 10 |    c | uuid-1 |        6,1 |
| 11 |    c | uuid-2 |     16,2,7 |
| 12 |    c | uuid-3 |        8,3 |
| 13 |    c | uuid-6 |       15,9 |
| 14 |    c | uuid-7 |            |
| 15 |    d | uuid-6 |       13,9 |
| 16 |    d | uuid-2 |     2,7,11 |

Now, for example, id = 4 contains an empty array instead of a NULL value. A NULL can be obtained with the NULLIF function, which gives NULL if both of its arguments are equal and otherwise gives the first argument:

SELECT 
    id, type, master_guid as uuid,  
    NULLIF(
        array_remove(array_agg(id) OVER (PARTITION BY master_guid), id), 
        '{}'::int[]
    ) as mapped_ids 
FROM table1
ORDER BY id
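The query above returns every row, while the desired output in the question lists only the rows of type a. One possible way to narrow it down is sketched below. The filter has to sit outside the window calculation, because a WHERE clause in the same SELECT would discard the b, c and d rows before array_agg sees them. The ORDER BY and frame clause inside OVER are an addition that makes the array order deterministic; the subquery alias s is arbitrary.

```sql
SELECT id, type, uuid, mapped_ids
FROM (
    SELECT
        id, type, master_guid AS uuid,
        NULLIF(
            array_remove(
                array_agg(id) OVER (
                    PARTITION BY master_guid
                    ORDER BY id
                    -- widen the frame so every row sees the whole partition
                    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                ),
                id
            ),
            '{}'::int[]   -- assumes id is an integer column
        ) AS mapped_ids
    FROM table1
) AS s
WHERE type = 'a'          -- applied after the window function has run
ORDER BY id;
```

Note that with the sample data id = 2 is mapped to 7, 11 and 16, because row 16 (type d) also carries uuid-2, so the result differs slightly from the expected output listed in the question.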

Answer 1: (score: 0)

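Try this:

select
  t1.id, t1.type, t1.master_guid, array_agg (distinct t2.id)
from
  table1 t1
  left join table1 t2 on
    t1.master_guid = t2.master_guid and
    t1.id != t2.id
group by
  t1.id, t1.type, t1.master_guid

I didn't come up with exactly the same results you listed, but I think this is close; possibly there is a slight error in your expected values, or only a small error on my side…

-- EDIT --

For approach #2, I think you would just add an inner join to Table2 to get the GUID.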
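For approach #2, where master_guid lives in table2, the join could look roughly like the sketch below. This is not the answerer's own query, only one way to read the suggestion; it assumes table2.id matches table1.id one-to-one, and the aliases g1 and g2 are arbitrary. The FILTER clause (PostgreSQL 9.4 or later) is an addition: with a plain left join, an id without any match would otherwise aggregate to an array holding a single NULL instead of yielding NULL.

```sql
SELECT
    t1.id,
    t1.type,
    g1.master_guid AS uuid,
    -- skip the NULL produced by the left join when nothing matches;
    -- an aggregate over zero rows then yields NULL
    array_agg(g2.id ORDER BY g2.id)
        FILTER (WHERE g2.id IS NOT NULL) AS mapped_ids
FROM table1 AS t1
JOIN table2 AS g1
    ON g1.id = t1.id                    -- look up this row's guid
LEFT JOIN table2 AS g2
    ON g2.master_guid = g1.master_guid  -- other rows sharing the guid
   AND g2.id <> g1.id                   -- but not the row itself
WHERE t1.type = 'a'
GROUP BY t1.id, t1.type, g1.master_guid
ORDER BY t1.id;
```

Because this version uses GROUP BY rather than a window function, the type filter can stay in the same SELECT: the other types are still reachable through the g2 side of the join.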