PostgreSQL: group rows by common array elements

Time: 2017-09-29 09:19:42

Tags: arrays postgresql grouping intersection

I have a table like this:

CREATE TABLE preferences (name varchar, preferences varchar[]);
INSERT INTO preferences (name, preferences) 
VALUES 
    ('John','{pizza, spaghetti}'), 
    ('Charlie','{spaghetti, rice}'), 
    ('Lucy','{rice, potatoes}'), 
    ('Beth','{bread, cheese}'), 
    ('Trudy','{rice, milk}');

So from the table

John      {pizza, spaghetti}
Charlie   {spaghetti, rice}
Lucy      {rice, potatoes}
Beth      {bread, cheese}
Trudy     {rice, milk}

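I want to group all rows that have elements in common, even if the link only exists through other people. So in this case I would like to end up with: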

{John,Charlie,Lucy,Trudy}     {pizza,spaghetti,rice,potatoes,milk}
{Beth}                        {bread, cheese}

This is because John's preferences intersect with Charlie's, and Charlie's intersect with those of Lucy and Trudy.

I already have an array_intersection function like this:

CREATE OR REPLACE FUNCTION array_intersection(anyarray, anyarray)
  RETURNS anyarray
  language sql
as $FUNCTION$
    SELECT ARRAY(
        SELECT UNNEST($1)
        INTERSECT
        SELECT UNNEST($2)
    );
$FUNCTION$;

I also know about the array_agg function for aggregating arrays, but how to turn these into the grouping I want is the step I am missing.

1 answer:

Answer 0 (score: 2)

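This is a typical task for recursion. You need a helper function that merges two arrays into one sorted array without duplicates, so that equal sets of preferences always compare equal: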

create or replace function public.array_merge(arr1 anyarray, arr2 anyarray)
    returns anyarray
    language sql immutable
as $function$
    select array_agg(distinct elem order by elem)
    from (
        select unnest(arr1) elem 
        union
        select unnest(arr2)
    ) s
$function$;

Use the function in a recursive query:

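with recursive cte(name, preferences) as (
    -- non-recursive term: start from every row of the table
    select *
    from preferences
union
    -- recursive term: attach the accumulated array to any other person
    -- whose preferences overlap (&&) it; "union" drops rows already seen
    select p.name, array_merge(c.preferences, p.preferences)
    from cte c
    join preferences p 
    on c.preferences && p.preferences 
    and c.name <> p.name
)
-- "order by name" inside array_agg makes the order of names deterministic
select array_agg(name order by name) as names, preferences
from (
    -- keep, for each name, the largest merged array it reached
    select distinct on(name) *
    from cte
    order by name, cardinality(preferences) desc
    ) s
group by preferences;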

           names           |             preferences              
---------------------------+--------------------------------------
 {Charlie,John,Lucy,Trudy} | {milk,pizza,potatoes,rice,spaghetti}
 {Beth}                    | {bread,cheese}
(2 rows)
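
As a cross-check outside the database, the logic of the recursive query can be sketched in Python. This is only an illustration, not part of the answer's SQL: the names ROWS and group_by_shared_elements are hypothetical, the sample data is copied from the question's INSERT statement, and Python sets stand in for the sorted, duplicate-free arrays that array_merge produces.

```python
# Sample rows copied from the question's INSERT statement.
ROWS = {
    "John": {"pizza", "spaghetti"},
    "Charlie": {"spaghetti", "rice"},
    "Lucy": {"rice", "potatoes"},
    "Beth": {"bread", "cheese"},
    "Trudy": {"rice", "milk"},
}


def group_by_shared_elements(rows):
    """Hypothetical helper mirroring the recursive query step by step."""
    # Non-recursive term: start from every (name, preferences) row.
    seen = {(name, frozenset(prefs)) for name, prefs in rows.items()}
    frontier = set(seen)

    # Recursive term: join the rows produced in the previous round with the
    # base table on "sets overlap and names differ", until nothing new appears.
    while frontier:
        produced = set()
        for c_name, c_prefs in frontier:
            for p_name, p_prefs in rows.items():
                if c_name != p_name and c_prefs & p_prefs:
                    produced.add((p_name, c_prefs | frozenset(p_prefs)))
        frontier = produced - seen  # like UNION: discard rows already seen
        seen |= frontier

    # "distinct on (name) ... order by cardinality desc":
    # keep the largest merged set reached for each name.
    largest = {}
    for name, prefs in seen:
        if name not in largest or len(prefs) > len(largest[name]):
            largest[name] = prefs

    # "group by preferences": collect the names that share the same merged set.
    groups = {}
    for name, prefs in largest.items():
        groups.setdefault(prefs, []).append(name)

    return sorted((sorted(names), sorted(prefs)) for prefs, names in groups.items())
```

Calling group_by_shared_elements(ROWS) returns two groups: Beth alone with bread and cheese, and Charlie, John, Lucy and Trudy with milk, pizza, potatoes, rice and spaghetti, which agrees with the query output above. Like the query, the sketch explores many intermediate rows, so it is meant for checking small inputs rather than for large tables.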