Maintaining distinct row counts with CUBE

Time: 2014-01-15 19:05:49

Tags: sql sql-server tsql

Suppose my data consists of students' SSNs, the college campus each student attended, and their wages for a given year. Something like this:

create table #thetable (SSN int, campus int, wage int);

insert into #thetable(SSN, campus, wage)
values
(111111111,1,100),
(111111111,2,100),
(222222222,1,250),
(222222222,2,250),
(333333333,1,50),
(444444444,2,400);

Now I want the average wage of the students at each campus, as well as the average wage of students across all campuses, so I do something like this:

select campus, avg(wage)
from #thetable
group by cube(campus);

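The problem is that when the campuses are grouped together, I don't want students who attended both campuses to be double counted. This is the output I get (students 111111111 and 222222222 are double counted):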

Campus   (no column name)
1        133
2        250
NULL     191

This is the output I want (no double counting):

Campus   (no column name)
1        133
2        250
NULL     200

Can this be done without using multiple queries and a UNION operator? If so, how? (By the way, I realize this table isn't normalized. Would normalizing it help?)

2 Answers:

Answer 0: (score: 1)

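You can't do this using a single column. cube summarizes values based on the calculation for each row, so if a row is included in one group's calculation, it is also included in the grand total.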
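
As a rough illustration of that point, the sketch below counts how many rows feed each cube group. The sample rows are copied from the question and inlined in a CTE (the name thetable and the column aliases are only for this example) so the query runs on its own:

```sql
-- Sample rows copied from the question, inlined so the query is self-contained.
with thetable (SSN, campus, wage) as (
    select * from (values
        (111111111, 1, 100), (111111111, 2, 100),
        (222222222, 1, 250), (222222222, 2, 250),
        (333333333, 1, 50),  (444444444, 2, 400)
    ) as v (SSN, campus, wage)
)
select campus,
       count(*)  as rows_aggregated,  -- 3, 3, and 6 for the NULL (all campuses) row
       sum(wage) as total_wage,       -- 400, 750, and 1150
       avg(wage) as avg_wage          -- 133, 250, and 191 (integer average)
from thetable
group by cube(campus);
```

The all-campuses row aggregates six rows even though there are only four students, which is where 1150 / 6 = 191 comes from.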

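However, you can get there by weighting each row by 1 divided by the number of rows for that student. This splits a student evenly across campuses, so that the weights for each student add up to 1: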

select campus, avg(wage) as avg_wage, sum(wage*weight) / sum(weight) avg_wage_weighted
from (select t.*, (1.0 / count(*) over (partition by SSN)) as weight
      from #thetable t
     ) t
group by cube(campus);

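The second aggregate column (avg_wage_weighted) should hold the value you want for the all-campuses row. You can then wrap this in a further subquery to get a single column: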

select campus, (case when campus is null then avg_wage_weighted else avg_wage end) as avg_wage
from (select campus, avg(wage) as avg_wage, sum(wage*weight) / sum(weight) avg_wage_weighted
      from (select t.*, (1.0 / count(*) over (partition by SSN)) as weight
            from #thetable t
           ) t
      group by cube(campus)
     ) t;

Here is a SQL Fiddle showing the solution.
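
To see what the weighting does, it may help to look at the per-row weights before they are aggregated. This is only a sketch: the rows are the question's sample data inlined in a CTE, and campus_count is an illustrative column added here, not part of the answer's query.

```sql
-- Sample rows copied from the question, inlined so the query is self-contained.
with thetable (SSN, campus, wage) as (
    select * from (values
        (111111111, 1, 100), (111111111, 2, 100),
        (222222222, 1, 250), (222222222, 2, 250),
        (333333333, 1, 50),  (444444444, 2, 400)
    ) as v (SSN, campus, wage)
)
select SSN, campus, wage,
       count(*) over (partition by SSN)       as campus_count, -- rows per student: 2, 2, 1, 1
       1.0 / count(*) over (partition by SSN) as weight        -- 0.5 per row for two-campus students, 1 otherwise
from thetable
order by SSN, campus;
```

Across all six rows the weighted wages sum to 800 and the weights sum to 4, giving 200. Within campus 1 alone the weighted wages sum to 225 and the weights to 2, giving 112.5 rather than 133, which is why the CASE expression falls back to the plain avg(wage) for the per-campus rows.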

Answer 1: (score: 0)

Worked it out with a correlated subquery. Works for me.

select campus,
(
    select avg(wage)
    from 
    (
        select ssn, campus, wage, row_number() over(partition by SSN order by wage) as RN
        from #thetable as inside
        where (inside.campus=outside.campus or outside.campus is null) 
    ) as middle
    where RN=1
)
from #thetable outside
group by cube(campus);
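
The de-duplication step inside that subquery can be tried in isolation. The sketch below is a simplified version for the all-campuses case only (the branch where outside.campus is null), using the question's sample rows inlined in a CTE. The alias ranked and the output column name are illustrative.

```sql
-- Sample rows copied from the question, inlined so the query is self-contained.
with thetable (SSN, campus, wage) as (
    select * from (values
        (111111111, 1, 100), (111111111, 2, 100),
        (222222222, 1, 250), (222222222, 2, 250),
        (333333333, 1, 50),  (444444444, 2, 400)
    ) as v (SSN, campus, wage)
)
select avg(wage) as avg_wage_all_students  -- (100 + 250 + 50 + 400) / 4 = 200
from (
    -- Number each student's rows; RN = 1 keeps exactly one row per SSN.
    select SSN, wage,
           row_number() over (partition by SSN order by wage) as RN
    from thetable
) as ranked
where RN = 1;
```

This relies on a student's wage being the same on every row, as in the sample data. If the wages differed, ordering by wage would keep the lowest one for each student.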