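We have a tool that lets users create their own groups. Within those groups, users can write posts. What I am trying to determine is the relationship between the size of a group and the total number of posts in that group.
I can write SQL to get a list of group names with the number of users in each group (Query 1), and a list of group names with the number of posts (Query 2), but I would like both in the same query.
Query 1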
select count(pg.personID) as GroupSize, g.GroupName
from [Group] g inner join PersonGroup pg on g.GroupID = pg.GroupID -- Group is a reserved word, so it is bracketed
where LastViewed between @startDate and @enddate and
g.Type = 0
group by g.GroupID, g.GroupName
order by GroupSize
Query 2
select count(gp.PostID) as TotalPosts, g.GroupName
from [Group] g inner join GroupPost gp on g.GroupID = gp.GroupID -- Group is a reserved word, so it is bracketed
inner join Post p on gp.PostID = p.PostID
where g.Type = 0 and
gp.Created between @startDate and @enddate
group by g.GroupID, g.GroupName
order by TotalPosts
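Note: a person can publish the same post to multiple groups.
I believe that with this data I could build a histogram (number of groups with 10-20 users, 21-30 users, and so on) and combine it with the average number of posts for the groups in each of those bins.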
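The binning step described above can be sketched as follows. This is a minimal illustration rather than the poster's actual setup: it runs SQLite through Python's `sqlite3` module, and the `GroupStats` table and its rows are hypothetical stand-ins for the output of a combined size-and-posts query. The bins here are 10-19, 20-29, and so on, which differs slightly from the 10-20 / 21-30 ranges mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical per-group summary (one row per group), made up for illustration.
    CREATE TABLE GroupStats (GroupName TEXT, GroupSize INTEGER, TotalPosts INTEGER);
    INSERT INTO GroupStats VALUES
        ('A', 12, 30), ('B', 18, 50), ('C', 25, 90), ('D', 27, 110), ('E', 34, 200);
""")

# Integer division maps sizes 10-19 to bin 10, 20-29 to bin 20, and so on.
rows = conn.execute("""
    SELECT (GroupSize / 10) * 10 AS BinStart,
           COUNT(*)              AS GroupsInBin,
           AVG(TotalPosts)       AS AvgPosts
    FROM GroupStats
    GROUP BY BinStart
    ORDER BY BinStart
""").fetchall()

for bin_start, n_groups, avg_posts in rows:
    print(f"{bin_start}-{bin_start + 9} users: {n_groups} groups, "
          f"{avg_posts:.1f} posts on average")
```

In T-SQL the same idea should carry over, although the `GROUP BY` would likely need to repeat the bin expression instead of using the alias.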
Answer 0 (score: 2)
A simple solution is to use those queries as subqueries and join them together:
SELECT
grps.GroupName,
grps.GroupSize,
psts.TotalPosts
FROM (
-- members per group (ORDER BY is not allowed inside a derived table, so it moves to the outer query)
select count(pg.personID) as GroupSize, g.GroupName, g.GroupID
from [Group] g inner join PersonGroup pg on g.GroupID = pg.GroupID
where LastViewed between @startDate and @enddate and
g.Type = 0
group by g.GroupID, g.GroupName) grps
JOIN (
-- posts per group
select count(gp.PostID) as TotalPosts, g.GroupName, g.GroupID
from [Group] g inner join GroupPost gp on g.GroupID = gp.GroupID
inner join Post p on gp.PostID = p.PostID
where g.Type = 0 and
gp.Created between @startDate and @enddate
group by g.GroupID, g.GroupName) psts
ON psts.GroupID = grps.GroupID
ORDER BY grps.GroupSize
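Answer 1 (score: 0)
Paul's solution assumes that the two sets of groups (those with posts and those with users) are the same. That may not be true, so a full outer join or a union is needed.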
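To make the concern concrete, here is a small sketch using SQLite through Python's `sqlite3` module. The `grps` and `psts` tables are hypothetical stand-ins for the two aggregated subqueries, with made-up rows. A `LEFT JOIN` is used only to show one direction of the problem (a group with members but no posts); a full outer join or a union would be needed to also keep groups that have posts but no matching members.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical pre-aggregated results of the two subqueries.
    CREATE TABLE grps (GroupID INTEGER, GroupSize INTEGER);
    CREATE TABLE psts (GroupID INTEGER, TotalPosts INTEGER);
    INSERT INTO grps VALUES (1, 10), (2, 5);  -- group 2 has members...
    INSERT INTO psts VALUES (1, 7);           -- ...but no posts in the window
""")

# Inner join: group 2 silently disappears from the result.
inner = conn.execute("""
    SELECT grps.GroupID, grps.GroupSize, psts.TotalPosts
    FROM grps JOIN psts ON psts.GroupID = grps.GroupID
    ORDER BY grps.GroupID
""").fetchall()

# Left join: group 2 is kept, with a post count of 0.
outer = conn.execute("""
    SELECT grps.GroupID, grps.GroupSize, COALESCE(psts.TotalPosts, 0)
    FROM grps LEFT JOIN psts ON psts.GroupID = grps.GroupID
    ORDER BY grps.GroupID
""").fetchall()

print(inner)  # [(1, 10, 7)]
print(outer)  # [(1, 10, 7), (2, 5, 0)]
```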
My preference is the following:
with groups as
(
select *
from [Group] g
where g.Type = 0
and g.LastViewed between @startDate and @enddate -- assumes LastViewed is a column of [Group]
)
select GroupId, GroupName, SUM(GroupSize) as GroupSize, SUM(TotalPosts) as TotalPosts
from
(
-- one row per membership, flagged as a member
(select groups.GroupId, groups.GroupName, 1 as GroupSize, 0 as TotalPosts
from groups
join PersonGroup pg
on pg.GroupId = groups.groupId
)
union all
-- one row per post, flagged as a post
-- (no gp.Created filter here; add one if posts should be limited to the date range as in Query 2)
(select groups.GroupId, groups.GroupName, 0 as GroupSize, 1 as TotalPosts
from groups
join GroupPost gp
on groups.GroupId = gp.GroupId
join Post p
on gp.PostId = p.PostId
)
) t -- a derived table needs an alias
group by GroupId, GroupName
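The "with" clause defines the set of groups you are working with. This puts the definition in one place and makes it obvious that both subqueries use the same filtering. The two subqueries just emit a flag for each of the two variables, and the flags are then summed at the outer level. Sometimes it is more efficient to do the aggregation inside the subqueries instead, particularly when indexes are available.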
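The flag-and-sum idea can be tried on toy data. The sketch below uses SQLite through Python's `sqlite3` module, not the original database; the table `Grp` (renamed to avoid the reserved word) and all rows are hypothetical, and the date and type filters are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical toy schema and data, for illustration only.
    CREATE TABLE Grp (GroupID INTEGER, GroupName TEXT);
    CREATE TABLE PersonGroup (PersonID INTEGER, GroupID INTEGER);
    CREATE TABLE GroupPost (PostID INTEGER, GroupID INTEGER);
    INSERT INTO Grp VALUES (1, 'Chess'), (2, 'Hiking');
    INSERT INTO PersonGroup VALUES (100, 1), (101, 1), (102, 1), (100, 2);
    INSERT INTO GroupPost VALUES (500, 1), (501, 1);
""")

# Each branch emits one row per membership or per post, carrying a 1 in the
# column it contributes to; the outer SUMs turn the flags into counts.
rows = conn.execute("""
    SELECT GroupID, GroupName,
           SUM(GroupSize) AS GroupSize, SUM(TotalPosts) AS TotalPosts
    FROM (
        SELECT g.GroupID, g.GroupName, 1 AS GroupSize, 0 AS TotalPosts
        FROM Grp g JOIN PersonGroup pg ON pg.GroupID = g.GroupID
        UNION ALL
        SELECT g.GroupID, g.GroupName, 0 AS GroupSize, 1 AS TotalPosts
        FROM Grp g JOIN GroupPost gp ON gp.GroupID = g.GroupID
    ) AS flags
    GROUP BY GroupID, GroupName
    ORDER BY GroupID
""").fetchall()

print(rows)  # [(1, 'Chess', 3, 2), (2, 'Hiking', 1, 0)]
```

Because the two branches are stacked rather than joined to each other, the Hiking group still appears with zero posts, and memberships are not multiplied by posts.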