Question

I have a data structure like this:

id | array
-------------
 1 | {1,2,3,4}
 2 | {1,2,4,5}
 3 | {1,5,6,7}
 4 | {4,5,6,8}

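Every array has the same number of elements. For each combination of two rows, I want to find the number of elements that sit in the same position in both arrays. For example, {1,2,3,4} and {2,3,4,5} have zero elements in common (the shared values are not in the same positions), while {1,2,3,4} and {1,2,3,5} have three elements in common. The result for the data above would therefore look something like this: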

row_ids | common
----------------
{1,2}   |      2
{1,3}   |      1
{1,4}   |      0
{2,3}   |      1
{2,4}   |      0
{3,4}   |      2

Is this possible in Postgres, or do I need to pull all the data into Python and process it in memory there?

Answer 1

Using the handy with ordinality option, you can unnest the arrays in a common table expression while keeping track of the index of each element.
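As a small illustration of what with ordinality adds, the following standalone sketch unnests one literal array (the second row of the sample data) and returns each element together with its 1-based position. The aliases elem and nr simply mirror the ones used in the query in this answer.

```sql
-- Unnest a literal array and number its elements.
-- "nr" is the 1-based position supplied by WITH ORDINALITY.
select a.elem, a.nr
from unnest(array[1, 2, 4, 5]) with ordinality as a(elem, nr);

--  elem | nr
-- ------+----
--     1 |  1
--     2 |  2
--     4 |  3
--     5 |  4
```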

You can then self-join the result set, aggregate, and count the elements that are equal in both position and value.

One advantage of this approach is that it also handles arrays of different sizes correctly.

with cte as (
    select t.id, a.elem, a.nr
    from mytable t
    cross join lateral unnest(t.ar) with ordinality as a(elem, nr)
)
select 
    array[c1.id, c2.id] row_ids, 
    sum( (c1.nr = c2.nr and c1.elem = c2.elem)::int ) common
from cte c1
inner join cte c2 on c1.id < c2.id
group by c1.id, c2.id
order by c1.id, c2.id
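To try the query above locally, a minimal setup could look like the sketch below. The table name mytable and the column name ar are taken from the query. The column types (int and int[]) are an assumption, since the question does not state them.

```sql
-- Hypothetical schema: the types are assumed, not given in the question.
create table mytable (
    id int primary key,
    ar int[]
);

-- Sample rows copied from the question.
insert into mytable (id, ar) values
    (1, '{1,2,3,4}'),
    (2, '{1,2,4,5}'),
    (3, '{1,5,6,7}'),
    (4, '{4,5,6,8}');
```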

Demo on DB Fiddle:

row_ids | common
:------ | -----:
{1,2}   |      2
{1,3}   |      1
{1,4}   |      0
{2,3}   |      1
{2,4}   |      0
{3,4}   |      2
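The remark about arrays of different sizes can be checked with a self-contained variant of the same query. In this sketch, mytable is replaced by an inline values list containing two made-up rows of lengths 3 and 5. These rows are illustrative only and are not part of the original data.

```sql
with mytable(id, ar) as (
    -- Illustrative rows with different array lengths (not from the question).
    values
        (1, array[1, 2, 3]),
        (2, array[1, 2, 3, 4, 5])
),
cte as (
    select t.id, a.elem, a.nr
    from mytable t
    cross join lateral unnest(t.ar) with ordinality as a(elem, nr)
)
select
    array[c1.id, c2.id] row_ids,
    -- Count element pairs that agree in both position and value.
    sum( (c1.nr = c2.nr and c1.elem = c2.elem)::int ) common
from cte c1
inner join cte c2 on c1.id < c2.id
group by c1.id, c2.id
order by c1.id, c2.id;

--  row_ids | common
-- ---------+--------
--  {1,2}   |      3
```

Positions 4 and 5 exist only in the longer array, so they never find a partner with the same nr and simply contribute nothing to the sum.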
