Joining 4 tables with group by, 2 having conditions, and a where clause

Time: 2018-12-09 23:00:18

Tags: postgresql join group-by having having-clause

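My database contains 4 tables:

  1. users(id, "name", surname, birthdate)
  2. friendships(userid1, userid2, "timestamp")
  3. posts(id, userid, "text", "timestamp")
  4. likes(postid, userid, "timestamp")

I need a result set of unique user names for users who have more than 3 friendships dated within January 2018 and whose average number of likes per post is in the range [10; 35).

As a first step, I wrote this statement: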

select  distinct u."name"
from users u
join friendships f on u.id = f.userid1
where f."timestamp" between '2018-01-01'::timestamp and '2018-01-31'::timestamp
group by u.id
having count(f.userid1) > 3;

It works fine and returns 3 rows. But when I add the second part like this:

select  distinct u."name"
from users u
join friendships f on u.id = f.userid1
join posts p on p.userid = u.id
join likes l on p.id = l.postid
where f."timestamp" between '2018-01-01'::timestamp and '2018-01-31'::timestamp
group by u.id
having count(f.userid1) > 3 
    and ((count(l.postid) / count(distinct l.postid)) >= 10 
        and (count(l.postid) / count(distinct l.postid)) < 35);

I get a crazy 94 rows, and I don't know why. Thanks for any help.

2 answers:

Answer 0: (score: 1)

You don't need distinct on u.name, because the aggregation already removes duplicates.

select
   u."name"
from 
   users u
   inner join friendships f on u.id = f.userid1
   inner join posts p on u.id = p.userid
   inner join likes l on p.id = l.postid
where 
   f."timestamp" >= '2018-01-01'::timestamp 
   and f."timestamp" < '2018-02-01'::timestamp
group by 
    u."name"
having 
    -- f.userid1 always equals u.id here, so count distinct friends via f.userid2
    count(distinct f.userid2) > 3 
    and ((count(l.postid) / count(distinct l.postid)) >= 10 
            and (count(l.postid) / count(distinct l.postid)) < 35);

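As mentioned in the comments, using between for a date range like this is not a good idea: '2018-01-31'::timestamp means midnight at the start of January 31, so the rest of that day is excluded.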

f."timestamp" >= '2018-01-01'::timestamp 
and f."timestamp" < '2018-02-01'::timestamp

will give you the entire month.
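To make the boundary behavior concrete, here is a small self-contained check. The three timestamps are made-up sample values, not data from the question, and the column aliases are only illustrative.

```sql
-- Compare the two range tests on hand-picked sample timestamps.
select
    ts,
    -- inclusive on both ends, but the upper bound is 2018-01-31 00:00:00
    ts between '2018-01-01'::timestamp and '2018-01-31'::timestamp as in_between,
    -- half-open range: covers every instant of January 2018
    ts >= '2018-01-01'::timestamp and ts < '2018-02-01'::timestamp as in_half_open
from (values
    ('2018-01-31 00:00:00'::timestamp),  -- exactly the between upper bound
    ('2018-01-31 15:30:00'::timestamp),  -- later on the last day of the month
    ('2018-02-01 00:00:00'::timestamp)   -- first instant of February
) as t(ts);
```

The middle row is the interesting one: it falls in January, yet the between test rejects it while the half-open test accepts it.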

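Answer 1: (score: 0)

Try the following! The problem with using "count(f.userid1) > 3" is that if a user has, for example, 2 friends, 6 posts, and 3 likes, the joins below give them at least 2 x 6 = 12 rows, and so 12 records with a non-null f.userid1. By counting distinct f.userid2 you count distinct friends instead. The other counts used for filtering have similar problems.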

select  u."name"
from users u
join friendships f on u.id = f.userid1
join posts p on p.userid = u.id
left join likes l on p.id = l.postid
where f."timestamp" >= '2018-01-01'::timestamp and f."timestamp" < '2018-02-01'::timestamp
group by u.id, u."name"
having
 --more than three distinct friends
 count( distinct f.userid2) > 3 
  --distinct likes / distinct posts
  --we use l.* to count distinct likes since there's no primary key
  and ((count(distinct l.*) / count(distinct p.id)) >= 10 
        and (count(distinct l.*) / count(distinct p.id)) < 35);
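An alternative way to avoid the row multiplication altogether is to aggregate each table on its own before joining, so that every intermediate result has at most one row per user. The following is only a sketch under some assumptions: the CTE names (january_friends, likes_per_post, avg_likes) are made up for this example, friendships are read in one direction only (userid1 is the user, as in the question's query), likes are not restricted by date, and posts with no likes count toward the average as 0.

```sql
with january_friends as (
    -- one row per user: distinct friends whose friendship is dated in January 2018
    select f.userid1 as userid,
           count(distinct f.userid2) as friend_count
    from friendships f
    where f."timestamp" >= '2018-01-01'::timestamp
      and f."timestamp" <  '2018-02-01'::timestamp
    group by f.userid1
),
likes_per_post as (
    -- one row per post; count(l.postid) is 0 for posts without likes
    select p.id, p.userid, count(l.postid) as like_count
    from posts p
    left join likes l on l.postid = p.id
    group by p.id, p.userid
),
avg_likes as (
    -- one row per user: average likes over that user's posts
    select userid, avg(like_count) as avg_likes_per_post
    from likes_per_post
    group by userid
)
select distinct u."name"
from users u
join january_friends jf on jf.userid = u.id
join avg_likes al on al.userid = u.id
where jf.friend_count > 3
  and al.avg_likes_per_post >= 10
  and al.avg_likes_per_post < 35;
```

Because friendships and likes are never joined to each other directly, neither count is inflated by the other, and plain count and avg are enough. The distinct in the final select is kept because two different users may share the same name.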