How do I count the number of connections between a user and multiple tables?

Time: 2015-12-03 21:12:25

Tags: postgresql

I am using PostgreSQL 9.3.9 and have a table users that looks like this:

user_id   | email
----------------------------
1001      | hello@world.com
1030      | mel@hotmail.com
2333      | jess@gmail.com
2502      | peter@gmail.com
3000      | olivia@hotmail.com
4000      | sharon@gmail.com
4900      | lisa@gmail.com

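I also have several tables that record which platforms a user has connected to and when the connection was made, namely platform_a, platform_b, and platform_c.

platform_a might look like this: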

user_id | created_at
----------------------------
1001    | 2015-04-30
2333    | 2015-05-15
3000    | 2014-02-15

platform_b might look like this:

user_id | created_at
----------------------------
1001    | 2015-06-30
2333    | 2015-07-02
4900    | 2015-07-03

platform_c might look like this:

user_id | created_at
----------------------------
1001    | 2015-08-16
1030    | 2015-07-03
3000    | 2015-09-01 
4000    | 2015-09-01

I would like the final result to look like this:

user_id | # of connections | latest created_at  | connected to a | connected to b | connected to c
--------------------------------------------------------------------------------------------------
1001    | 3                | 2015-08-16         | yes            | yes            | yes
1030    | 1                | 2015-07-03         | no             | no             | yes
2333    | 2                | 2015-07-02         | yes            | yes            | no
2502    | 0                |                    | no             | no             | no
3000    | 2                | 2015-09-01         | yes            | no             | yes
4000    | 1                | 2015-09-01         | no             | no             | yes
4900    | 1                | 2015-07-03         | no             | yes            | no            

How can I do this?

3 answers:

Answer 0: (score: 4)

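First, build a union of all the tables:

-- UNION ALL keeps every row; a plain UNION would collapse identical rows
-- and could undercount connections.
SELECT user_id, created_at, 1 AS a, 0 AS b, 0 AS c FROM platform_a
UNION ALL
SELECT user_id, created_at, 0 AS a, 1 AS b, 0 AS c FROM platform_b
UNION ALL
SELECT user_id, created_at, 0 AS a, 0 AS b, 1 AS c FROM platform_c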

Then group the result of this subquery:

SELECT user_id, COUNT(user_id), MAX(created_at), MAX(a), MAX(b), MAX(c)
FROM subquery_above
GROUP BY user_id

This will not return users with zero connections, but you can get those with a LEFT JOIN from the list of users.
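The two steps and the LEFT JOIN could be combined roughly as follows. This is a minimal sketch, assuming the users and platform_a/b/c tables from the question; the output column aliases (connections, latest_created_at, connected_to_a, and so on) are names chosen here for illustration, not part of the answer.

```sql
-- Sketch: union the platform tables, then LEFT JOIN from users so that
-- users without any connection still appear in the result.
SELECT u.user_id,
       COUNT(p.user_id)      AS connections,        -- counts matched rows only, so 0 when there are none
       MAX(p.created_at)     AS latest_created_at,  -- NULL when there are none
       COALESCE(MAX(p.a), 0) AS connected_to_a,     -- 1 if any row came from platform_a
       COALESCE(MAX(p.b), 0) AS connected_to_b,
       COALESCE(MAX(p.c), 0) AS connected_to_c
FROM users u
LEFT JOIN (
    SELECT user_id, created_at, 1 AS a, 0 AS b, 0 AS c FROM platform_a
    UNION ALL
    SELECT user_id, created_at, 0 AS a, 1 AS b, 0 AS c FROM platform_b
    UNION ALL
    SELECT user_id, created_at, 0 AS a, 0 AS b, 1 AS c FROM platform_c
) p ON p.user_id = u.user_id
GROUP BY u.user_id
ORDER BY u.user_id;
```

COUNT(p.user_id) is used rather than COUNT(*) because the LEFT JOIN still produces one row for a user with no connections, and COUNT(*) would report that row as 1.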

Answer 1: (score: 3)

select 
    user_id, 
    count(p), 
    max(created_at),
    coalesce(sum((pl = 'a')::int), 0) connected_to_a,
    coalesce(sum((pl = 'b')::int), 0) connected_to_b,
    coalesce(sum((pl = 'c')::int), 0) connected_to_c
from users u
left join (
    select *, 'a' pl from platform_a
    union all
    select *, 'b' pl from platform_b
    union all
    select *, 'c' pl from platform_c
    ) p
using (user_id)
group by 1;

 user_id | count |    max     | connected_to_a | connected_to_b | connected_to_c 
---------+-------+------------+----------------+----------------+----------------
    1001 |     3 | 2015-08-16 |              1 |              1 |              1
    1030 |     1 | 2015-07-03 |              0 |              0 |              1
    2333 |     2 | 2015-07-02 |              1 |              1 |              0
    2502 |     0 |            |              0 |              0 |              0
    3000 |     2 | 2015-09-01 |              1 |              0 |              1
    4000 |     1 | 2015-09-01 |              0 |              0 |              1
    4900 |     1 | 2015-07-03 |              0 |              1 |              0
(7 rows)

Answer 2: (score: 1)

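When you query all users, it is typically fastest to aggregate before joining.

The result is exactly as requested, except that connections is NULL instead of '0' as in your example. Use COALESCE() in the outer SELECT if you need to convert it. I didn't, because SELECT * is very convenient. If you are going to list all columns of the outer users anyway, you can also use u instead of the subquery {{1}} to cut the additional columns.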

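bool_or() is the perfect aggregate function for this job.

A user may have multiple connections on one platform. This query still returns one row per user.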
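The approach described in this answer, aggregating first and joining afterwards, might look something like the sketch below. It is not necessarily the answerer's exact query. It assumes the tables from the question, and the pl tag column and the output aliases are illustrative names.

```sql
-- Sketch: aggregate all platform rows per user first, then LEFT JOIN
-- the one-row-per-user result to users.
SELECT u.user_id,
       COALESCE(c.connections, 0)        AS connections,  -- NULL (no match) becomes 0
       c.latest_created_at,                               -- stays NULL for users without connections
       COALESCE(c.connected_to_a, false) AS connected_to_a,
       COALESCE(c.connected_to_b, false) AS connected_to_b,
       COALESCE(c.connected_to_c, false) AS connected_to_c
FROM users u
LEFT JOIN (
    SELECT user_id,
           count(*)          AS connections,
           max(created_at)   AS latest_created_at,
           bool_or(pl = 'a') AS connected_to_a,  -- true if at least one row is from platform_a
           bool_or(pl = 'b') AS connected_to_b,
           bool_or(pl = 'c') AS connected_to_c
    FROM (
        SELECT user_id, created_at, 'a'::text AS pl FROM platform_a
        UNION ALL
        SELECT user_id, created_at, 'b'::text FROM platform_b
        UNION ALL
        SELECT user_id, created_at, 'c'::text FROM platform_c
    ) p
    GROUP BY user_id
) c USING (user_id)
ORDER BY u.user_id;
```

Because the subquery already yields at most one row per user, the outer join cannot multiply rows, and no GROUP BY is needed at the outer level. The COALESCE() calls are optional; without them, users with no connections show NULL in the aggregated columns.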