Postgres: group by and get the most frequent value, plus a count of matching foreign keys

Time: 2019-04-01 15:02:24

Tags: postgresql

I'm using Postgres 9.6. I have a table called person that looks like this:

 id         | integer (pk)        
 name       | character varying(300) 
 name_slug  | character varying(50)

Another table, called person_to_manor, looks like this, where person_id is a foreign key to person.id:

 id        | integer (pk)
 manor_id  | integer
 person_id | integer 

I want to combine these two tables to populate a third table, canonical_person, whose primary key is name_slug and which has the following fields:

 name_slug  | character varying(50) (pk)
 name       | character varying(300) 
 num_manor  | integer     

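Where:

  • name_slug is the primary key
  • name is the most common value of person.name when grouped by name_slug
  • num_manor is the number of rows in person_to_manor that match any of the id values of the person rows with that name_slug.

Is this possible in a single SQL query? This is as far as I've got...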

INSERT INTO canonical_person
VALUES (
  SELECT name_slug,
  [most popular value of name from `array_agg(distinct name) from person`],
  COUNT(number of rows in person_to_manor that match any of `array_agg(distinct id) from person`)
  FROM person
  GROUP BY name_slug);

1 Answer:

Answer 0 (score: 0)

Something like this?

I created three tables

CREATE TABLE test.person (
id int4 NOT NULL,
"name" varchar(300) NULL,
name_slug varchar(50) NULL,
CONSTRAINT person_pkey PRIMARY KEY (id)
);


CREATE TABLE test.person_to_manor (
id int4 NOT NULL,
manor_id int4 NULL,
person_id int4 NULL,
CONSTRAINT person_to_manor_pkey PRIMARY KEY (id),
CONSTRAINT person_to_manor_person_id_fkey FOREIGN KEY (person_id) REFERENCES test.person(id)
);


CREATE TABLE test.canonical_person (
name_slug varchar(50) NOT NULL,
"name" varchar(300) NULL,
num_manor int4 NULL,
CONSTRAINT canonical_person_pkey PRIMARY KEY (name_slug)
);

with the following values

select * from test.person;

id|name|name_slug
--|----|---------
 0|a   |ab       
 1|b   |aa       
 2|c   |ab       
 3|a   |bb       
 4|a   |ab       


select * from test.person_to_manor;

id|manor_id|person_id
--|--------|---------
 1|       5|        0
 2|       6|        0
 3|       7|        2

I ran this query

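-- Note: count(*) counts rows in test.person per (name, name_slug) pair,
-- and row_number() ranks those pairs across the whole table, so only the
-- single most frequent pair is inserted.
insert into test.canonical_person (name_slug, name, num_manor)
select name_slug,
       name as most_popular_name,
       sub.n as count_rows
from (
        select name,
               name_slug,
               count(*) as n,
               row_number() over (order by count(*) desc) as n_max
        from test.person
        group by name, name_slug
     ) as sub
where sub.n_max = 1;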

Result after the query

select * from test.canonical_person;

name_slug|name|num_manor
---------|----|---------
ab       |a   |        2

Is this what you're after?
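
For comparison, here is a minimal sketch of a query that produces one row per name_slug, as the question asks. It assumes the test schema and sample data shown above and an empty canonical_person table. It uses the mode() ordered-set aggregate (available since Postgres 9.4) to pick the most frequent name; ties are broken arbitrarily. The subquery alias m and its column n are names introduced here for illustration. The person_to_manor rows are counted per person before joining, so the join cannot duplicate person rows and skew the name frequency.

```sql
-- Sketch: one row per name_slug, with the most frequent name and the number
-- of person_to_manor rows linked to any person sharing that slug.
INSERT INTO test.canonical_person (name_slug, name, num_manor)
SELECT p.name_slug,
       mode() WITHIN GROUP (ORDER BY p.name) AS name,      -- most frequent name in the group
       COALESCE(SUM(m.n), 0)::int            AS num_manor  -- 0 when no person in the group has manors
FROM test.person AS p
LEFT JOIN (
    -- Pre-aggregate: number of person_to_manor rows per person.
    SELECT person_id, COUNT(*) AS n
    FROM test.person_to_manor
    GROUP BY person_id
) AS m ON m.person_id = p.id
WHERE p.name_slug IS NOT NULL  -- name_slug is the primary key of the target table
GROUP BY p.name_slug;
```

With the sample data above, this should insert (aa, b, 0), (ab, a, 3) and (bb, a, 0): the persons with ids 0 and 2 share the slug ab and together have three rows in person_to_manor.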