Suppose my data includes each student's SSN, the college campus they attended, and their wage for a given year. Like this:
create table #thetable (SSN int, campus int, wage int);
insert into #thetable(SSN, campus, wage)
values
(111111111,1,100),
(111111111,2,100),
(222222222,1,250),
(222222222,2,250),
(333333333,1,50),
(444444444,2,400);
Now I want the average wage of students at each campus, plus the average wage of students across all campuses, so I did something like this:
select campus, avg(wage)
from #thetable
group by cube(campus);
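The problem is that when the campuses are grouped together, I don't want to double count students who attended both campuses. This is the output I get (students 111111111 and 222222222 are double counted):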
Campus (no column name)
1 133
2 250
NULL 191
The output I want is this (no double counting):
Campus (no column name)
1 133
2 250
NULL 200
Can this be done without using multiple queries and the UNION operator? If so, how? (By the way, I realize this table isn't normalized. Would normalizing it help?)
Answer 0 (score: 1)
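You can't do this with a single column. cube summarizes values from the calculation on each row, so a row that is included in one group's calculation is also included in the total.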
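To see where the 191 comes from, it can help to compare the number of rows with the number of distinct students. This is only an illustrative sketch: it inlines the question's six rows in a CTE called thetable (a name chosen here so the snippet runs on its own instead of relying on #thetable).

```sql
-- Same six rows as the question, inlined so the snippet is self-contained.
with thetable (SSN, campus, wage) as (
    select SSN, campus, wage
    from (values
        (111111111, 1, 100),
        (111111111, 2, 100),
        (222222222, 1, 250),
        (222222222, 2, 250),
        (333333333, 1, 50),
        (444444444, 2, 400)
    ) as v (SSN, campus, wage)
)
select sum(wage)           as total_wage,    -- 1150
       count(*)            as row_count,     -- 6 rows, one per (student, campus) pair
       count(distinct SSN) as student_count  -- 4 students
from thetable;
```

The cube total divides 1150 by the 6 rows, which gives 191 after integer division. The figure the question asks for counts each of the 4 students once: 100 + 250 + 50 + 400 = 800, and 800 / 4 = 200.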
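However, you can get this result by weighting each value by 1 divided by the student's frequency. This splits each student evenly across their campuses, so that every student's rows add up to a weight of 1: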
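select campus,
       avg(wage) as avg_wage,
       -- each student's rows carry a total weight of 1, so nobody is double counted
       sum(wage * weight) / sum(weight) as avg_wage_weighted
from (select t.*,
             (1.0 / count(*) over (partition by SSN)) as weight  -- 1 / number of rows for this SSN
      from #thetable t
     ) t
group by cube(campus);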
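The avg_wage_weighted column holds the value you want for the all-campus row. You can then embed this in a further subquery to get a single column: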
select campus,
       -- campus is null on the all-campus row produced by cube
       (case when campus is null then avg_wage_weighted else avg_wage end) as avg_wage
from (select campus,
             avg(wage) as avg_wage,
             sum(wage * weight) / sum(weight) as avg_wage_weighted
      from (select t.*,
                   (1.0 / count(*) over (partition by SSN)) as weight
            from #thetable t
           ) t
      group by cube(campus)
     ) t;
Here is a SQL Fiddle showing the solution.
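To check the weighting by hand, it may help to list the per-row weights on their own. A minimal sketch, assuming the same six rows as the question; the CTE name thetable and the weighted_wage column are illustrative and not part of the answer's query.

```sql
-- Same six rows as the question, inlined so the snippet is self-contained.
with thetable (SSN, campus, wage) as (
    select SSN, campus, wage
    from (values
        (111111111, 1, 100),
        (111111111, 2, 100),
        (222222222, 1, 250),
        (222222222, 2, 250),
        (333333333, 1, 50),
        (444444444, 2, 400)
    ) as v (SSN, campus, wage)
)
select SSN,
       campus,
       wage,
       -- 1 / number of rows for this SSN: 0.5 for two-campus students, 1 otherwise
       1.0 / count(*) over (partition by SSN)        as weight,
       -- this row's contribution to sum(wage * weight)
       wage * 1.0 / count(*) over (partition by SSN) as weighted_wage
from thetable
order by SSN, campus;
```

Students 111111111 and 222222222 get a weight of 0.5 on each of their two rows, and the other two students get a weight of 1. Across all rows the weighted wages sum to 50 + 50 + 125 + 125 + 50 + 400 = 800 and the weights sum to 4, so the all-campus value is 800 / 4 = 200. Within campus 1 alone the same formula would give (50 + 125 + 50) / (0.5 + 0.5 + 1) = 112.5, which is why the per-campus rows keep the plain avg(wage).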
Answer 1 (score: 0)
Figured it out with a correlated subquery. Works for me.
select campus,
       (
        select avg(wage)
        from
        (
         -- RN = 1 keeps one row per SSN within the current campus (or across all campuses on the total row)
         select ssn, campus, wage, row_number() over(partition by SSN order by wage) as RN
         from #thetable as inside
         where (inside.campus=outside.campus or outside.campus is null)
        ) as middle
        where RN=1
       )
from #thetable outside
group by cube(campus);
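One thing to keep in mind with the RN=1 approach: because of order by wage, it keeps the lowest-wage row for each SSN, so it only matches the desired output when a student's wage is identical on every row. The sample data satisfies that. A hedged sketch of a check for it, again with the question's rows inlined in an illustrative CTE named thetable:

```sql
-- Same six rows as the question, inlined so the snippet is self-contained.
with thetable (SSN, campus, wage) as (
    select SSN, campus, wage
    from (values
        (111111111, 1, 100),
        (111111111, 2, 100),
        (222222222, 1, 250),
        (222222222, 2, 250),
        (333333333, 1, 50),
        (444444444, 2, 400)
    ) as v (SSN, campus, wage)
)
-- List any student whose wage differs between rows.
select SSN,
       min(wage) as min_wage,
       max(wage) as max_wage
from thetable
group by SSN
having min(wage) <> max(wage);
```

For the sample rows this returns nothing, since every student has the same wage at each campus. If it did return rows on real data, the total from this query and the weighted total from the other answer could differ.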