Question

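I am re-engineering some legacy code and came across this calculation. Can anyone here point out what the rationale for doing something like this would be? The author is no longer with the company and there is no documentation.

The context is: if employee type is defined as the lowest grain, the weighted average is first calculated at that level and then rolled up to the higher grain by recalculating the weighted average.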

    employee  department  employee_type  salary  weight  location
    A         X           F                1000    3.15  boston
    B         X           P                 300    1.27  NY
    C         Y           F                2000    3.38  Tampa
    D         Y           P                NULL    1.12  LA
    E         X           F                3000    3.38  SFO

Query used to calculate the average salary by department:

    select department,
           sum(case when avg_salary is not null then avg_salary * bonus else 0 end)
           / sum(case when avg_salary is not null then bonus else 1 end)
    from (
        select employee, department, location, employee_type,
               sum(weight) as bonus,
               sum(case when salary is not null then salary * weight else 0 end)
               / sum(case when salary is not null then weight else 1 end) as avg_salary
        from employee
        group by employee, department, location, employee_type
    ) x
    group by department

Output:

      X 1752.69230769231
      Y 1502.22222222222
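
As a cross-check, the two-stage calculation can be reproduced outside the database. The following is only a sketch in Python standing in for the SQL above; the names `ROWS`, `lowest_grain` and `roll_up` are made up for illustration, and `None` plays the role of the NULL salary.

```python
from collections import defaultdict

# Sample rows from the question:
# (employee, department, employee_type, salary, weight, location).
# None stands in for the missing (NULL) salary of employee D.
ROWS = [
    ("A", "X", "F", 1000, 3.15, "boston"),
    ("B", "X", "P", 300, 1.27, "NY"),
    ("C", "Y", "F", 2000, 3.38, "Tampa"),
    ("D", "Y", "P", None, 1.12, "LA"),
    ("E", "X", "F", 3000, 3.38, "SFO"),
]


def lowest_grain(rows):
    """Stage 1: mimic the inner query, grouping by
    (employee, department, location, employee_type)."""
    groups = defaultdict(list)
    for emp, dept, etype, salary, weight, loc in rows:
        groups[(emp, dept, loc, etype)].append((salary, weight))
    result = []
    for (emp, dept, loc, etype), items in groups.items():
        bonus = sum(w for _, w in items)
        num = sum(s * w for s, w in items if s is not None)
        # "else 1" in the CASE adds 1 to the denominator per NULL-salary row.
        den = sum(w if s is not None else 1 for s, w in items)
        result.append((dept, bonus, num / den))
    return result


def roll_up(stage1):
    """Stage 2: mimic the outer query, weighting each avg_salary by its bonus."""
    num = defaultdict(float)
    den = defaultdict(float)
    for dept, bonus, avg_salary in stage1:
        # avg_salary is never None here (a NULL salary became 0 in stage 1),
        # so the outer "is not null" branch is always the one taken.
        num[dept] += avg_salary * bonus
        den[dept] += bonus
    return {dept: num[dept] / den[dept] for dept in num}


dept_avg = roll_up(lowest_grain(ROWS))
for dept in sorted(dept_avg):
    print(dept, round(dept_avg[dept], 4))
```

With this sample data each lowest-grain group holds exactly one row, so stage 1 simply returns the salary (or 0 for the NULL salary) and all the weighting happens in stage 2.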

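If we aggregate at the lowest grain and then calculate the average salary at the higher grain, we get a different value.

So I guess the question is: is this a correct approach, and what is the rationale behind it? Is it just to account for missing values?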

Answer 1

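This is a simple weighted average. (Think SumProduct in Excel.)

You may notice the NULLIF() in the denominator. This is to avoid the dreaded divide-by-zero. I'm sure you know this, but you can Group By any combination of fields (from the atomic level on up).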
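
To illustrate grouping at different levels, here is a rough sketch that runs the same weighted-average expression through Python's built-in `sqlite3` module. SQLite is an assumption chosen only so the snippet is self-contained, not the engine used in this thread, and the `weighted_avg` helper is hypothetical. Note that SQLite returns NULL on division by zero by itself, whereas SQL Server raises an error, which is why NULLIF matters there.

```python
import sqlite3

# In-memory SQLite database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee TEXT, department TEXT, employee_type TEXT,"
    " salary REAL, weight REAL, location TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("A", "X", "F", 1000, 3.15, "boston"),
        ("B", "X", "P", 300, 1.27, "NY"),
        ("C", "Y", "F", 2000, 3.38, "Tampa"),
        ("D", "Y", "P", None, 1.12, "LA"),
        ("E", "X", "F", 3000, 3.38, "SFO"),
    ],
)


def weighted_avg(group_cols):
    """Weighted average salary grouped by the given columns.

    Hypothetical helper: group_cols is interpolated into the SQL text,
    so it must only ever contain trusted column names.
    """
    cols = ", ".join(group_cols)
    sql = (
        f"SELECT {cols}, SUM(salary * weight) / NULLIF(SUM(weight), 0) "
        f"FROM employee GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


# Same expression, two different grains.
by_department = weighted_avg(["department"])
by_dept_and_type = weighted_avg(["department", "employee_type"])

for row in by_department:
    print(row)
for row in by_dept_and_type:
    print(row)
```

At the finer grain, the group for department Y, type P contains only the NULL salary, so its weighted average comes back as NULL (`None` in Python) rather than 0.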

Example

    Declare @YourTable Table (
        [employee]      varchar(50),
        [department]    varchar(50),
        [employee_type] varchar(50),
        [salary]        money,
        [weight]        money,
        [location]      varchar(50)
    )
    Insert Into @YourTable Values
         ('A','X','F',1000,3.15,'boston')
        ,('B','X','P',300,1.27,'NY')
        ,('C','Y','F',2000,3.38,'Tampa')
        ,('D','Y','P',null,1.12,'LA')
        ,('E','X','F',3000,3.38,'SFO')

    Select Department
          ,WeigtedAvg = sum(Salary*Weight)/NullIf(sum(Weight),0)
     From  @YourTable
     Group By Department

Returns

    Department  WeigtedAvg
    X           1752.6923
    Y           1502.2222

Just for fun

    Select Department
          ,WeigtedAvgBonus = sum(Salary*Weight)/NullIf(sum(Weight),0)
          ,WeigtedAvgRate  = sum(Salary*Weight)/NullIf(sum(Salary),0)
     From  @YourTable
     Group By Department

Returns

    Department  WeigtedAvgBonus  WeigtedAvgRate
    X           1752.6923        3.1793
    Y           1502.2222        3.38    -- Notice this matches the only non-null observation in Y
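
On the missing-value part of the question, the result for department Y depends on whether the weight of the NULL-salary row stays in the denominator. The sketch below is only an illustration in Python, with hypothetical function names, comparing the behaviour of the expression above against one possible alternative that drops such rows entirely.

```python
# Department Y from the sample data as (salary, weight) pairs;
# None stands in for the NULL salary of employee D.
DEPT_Y = [(2000, 3.38), (None, 1.12)]


def weighted_avg_all_weights(rows):
    """Mirror sum(Salary*Weight)/NullIf(sum(Weight),0): NULL salaries drop
    out of the numerator, but their weights still count in the denominator."""
    num = sum(s * w for s, w in rows if s is not None)
    den = sum(w for _, w in rows)
    return num / den if den else None


def weighted_avg_known_only(rows):
    """Alternative (an assumption, not taken from the thread): ignore rows
    with a NULL salary in both the numerator and the denominator."""
    known = [(s, w) for s, w in rows if s is not None]
    num = sum(s * w for s, w in known)
    den = sum(w for _, w in known)
    return num / den if den else None


print(round(weighted_avg_all_weights(DEPT_Y), 4))  # weight of D included
print(round(weighted_avg_known_only(DEPT_Y), 4))   # weight of D excluded
```

The first function reproduces the 1502.2222 shown above, which effectively treats the missing salary as 0. The second gives 2000, the salary of the only employee in Y with a known value. Which of the two is "correct" depends on what a missing salary is supposed to mean.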
