Postgres: sum of distinct values from a join table

Date: 2017-11-30 23:04:59

Tags: sql postgresql

Question

I'm trying to build a query for a report. I'm using Postgres 9.6.1. Below I describe my schema, some sample data, and the result I'm trying to achieve.

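Apologies for the odd table schema. I'm starting from the AlertPost join table; basically, for each alert (alert_id) I need the sum of followers across distinct users. For unrelated performance reasons in the application, user_follow_count is denormalized onto the Post table, which is why it appears here rather than on the User table.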

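I've tried a bunch of queries using GROUP BY, window functions, and DISTINCT, but I'm not getting the right answer.

Schema

Assume both tables are fairly large (10MM+ rows) and that all foreign keys are indexed.

Table 1: Post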

- id
- user_id
- user_follow_count

Table 2: AlertPost

- id
- alert_id (different from id, this is a join table)
- post_id

Goal: for each alert_id, what is the sum of user_follow_count, counting each distinct user only once?

Sample data

AlertPosts
id: 1, alert_id: 1, post_id: 1 # Same alert_id, two different post_ids
id: 2, alert_id: 1, post_id: 2
id: 3, alert_id: 2, post_id: 3
id: 4, alert_id: 2, post_id: 4


Post
id: 1, user_id: 1, user_follow_count: 3 # Same user between several posts
id: 2, user_id: 2, user_follow_count: 5
id: 3, user_id: 1, user_follow_count: 3
id: 4, user_id: 1, user_follow_count: 3

Desired result

AlertPosts:
alert_id: 1, unique_followers: 8 # (sum of user_follow_count from user_id 1, 2)
alert_id: 2, unique_followers: 3 # (there are only posts from user_id 1)
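
To see why a plain join followed by SUM is not enough, here is a minimal sketch that inlines the sample data above as CTEs. The snake_case names alert_posts and post are assumptions made for illustration. With this data it returns 8 for alert_id 1 but 6 for alert_id 2, because user 1 has two posts under that alert and is counted twice; the desired value is 3.

```sql
-- Sample data inlined so the query runs on its own (table names are illustrative).
WITH alert_posts (id, alert_id, post_id) AS (
    VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 2, 4)
),
post (id, user_id, user_follow_count) AS (
    VALUES (1, 1, 3), (2, 2, 5), (3, 1, 3), (4, 1, 3)
)
-- Naive version: one row per post, so a user with several posts
-- under the same alert contributes user_follow_count several times.
SELECT a.alert_id,
       SUM(p.user_follow_count) AS naive_sum
FROM alert_posts a
JOIN post p ON p.id = a.post_id
GROUP BY a.alert_id
ORDER BY a.alert_id;
```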

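1 answer:

Answer 0 (score: 1)

You can solve it in two steps. First, reduce the joined rows to the distinct combinations of alert_id, user_id, and user_follow_count; only then sum the result.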

--Creating samples...
CREATE TABLE alert_posts (id, alert_id, post_id) AS
    VALUES 
        (1,1,1),
        (2,1,2),
        (3,2,3),
        (4,2,4);

CREATE TABLE post (id, user_id, user_follow_count) AS
    VALUES 
        (1,1,3),
        (2,2,5),
        (3,1,3),
        (4,1,3); 

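--First step: flatten to one row per (alert_id, user_id)
WITH tmp AS (
    SELECT DISTINCT 
            a.alert_id,
            --user_id must be in the DISTINCT list; otherwise two users with
            --the same user_follow_count under one alert collapse into one row
            p.user_id,
            --Assumption: for repeated users, take user_follow_count from the
            --post with the highest id (first_value over ORDER BY p.id DESC)
            first_value(p.user_follow_count) OVER (
                    PARTITION BY 
                            a.alert_id, 
                            p.user_id 
                    ORDER BY p.id DESC) AS user_follow_count 
    FROM 
            alert_posts a
    JOIN post p ON p.id = a.post_id
)
--Now you can do a regular sum 
SELECT alert_id, SUM(user_follow_count) AS unique_followers FROM tmp GROUP BY alert_id;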
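
As a variation on the same two-step idea, the sketch below avoids the window function: an inner GROUP BY collapses each (alert_id, user_id) pair to one row, and the outer query sums those rows. It assumes the alert_posts and post sample tables created above. Using MAX(user_follow_count) when a user's posts carry different counts is an assumed tie-breaking rule, not something the question specifies.

```sql
-- Assumes the alert_posts and post tables from the sample setup above.
SELECT alert_id,
       SUM(user_follow_count) AS unique_followers
FROM (
    -- One row per (alert_id, user_id); MAX is an assumed tie-breaker
    -- for users whose posts carry different user_follow_count values.
    SELECT a.alert_id,
           p.user_id,
           MAX(p.user_follow_count) AS user_follow_count
    FROM alert_posts a
    JOIN post p ON p.id = a.post_id
    GROUP BY a.alert_id, p.user_id
) per_user
GROUP BY alert_id
ORDER BY alert_id;
```

On the sample data this yields 8 for alert_id 1 and 3 for alert_id 2, matching the desired result in the question.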
