SQL weighted average

Time: 2016-10-17 03:23:52

Tags: sql postgresql

I have a table like the following.

make   | model | engine | cars_checked | avg_mileage
-------|-------|--------|--------------|------------
suzuki | sx4   | petrol | 11           | 12
suzuki | sx4   | diesel | 150          | 16
suzuki | swift | petrol | 140          | 15
suzuki | swift | diesel | 18           | 19
toyota | prius | petrol | 16           | 17
toyota | prius | hybrid | 250          | 24

The desired output is

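  1. average mileage by engine (petrol, diesel)
  2. average mileage by make
  3. average mileage by model

A simple group by will not work, because the number of samples behind each record (cars_checked) has to be used as a weight, to avoid the average-of-averages problem.

What is the right way to do this? Is there a way to take the sample counts into account as weights when averaging in a group by?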

Update - added the output format for #1 above as an example:

    engine   | mileage_by_engine
    --------------------------
    petrol   | xx.z
    diesel   | yy.z
    

2 answers:

Answer 0 (score: 5)

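-- Weighted average per engine: each row's avg_mileage is weighted by cars_checked.
-- If both columns are integers, cast one side (e.g. ::numeric) to keep the decimals.
SELECT engine, SUM(cars_checked * avg_mileage) / SUM(cars_checked) AS avgMilageByEngine
FROM [YOUR_TABLE]
GROUP BY engine

-- Same weighting, grouped by make.
SELECT make, SUM(cars_checked * avg_mileage) / SUM(cars_checked) AS avgMilageByMake
FROM [YOUR_TABLE]
GROUP BY make

-- Same weighting, grouped by model.
SELECT model, SUM(cars_checked * avg_mileage) / SUM(cars_checked) AS avgMilageByModel
FROM [YOUR_TABLE]
GROUP BY model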
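
As a rough check of the queries above, here is a minimal sketch that runs the same SUM(cars_checked * avg_mileage) / SUM(cars_checked) expression on the sample rows from the question and compares it with a plain AVG(avg_mileage). It is not the answerer's code. It assumes Python's built-in sqlite3 module instead of PostgreSQL, integer columns, and a hypothetical table name (cars) and helper name (mileage_by). The 1.0 * factor is only there because integer division would otherwise truncate the result. In PostgreSQL a cast such as SUM(cars_checked * avg_mileage)::numeric should serve the same purpose.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("suzuki", "sx4", "petrol", 11, 12),
    ("suzuki", "sx4", "diesel", 150, 16),
    ("suzuki", "swift", "petrol", 140, 15),
    ("suzuki", "swift", "diesel", 18, 19),
    ("toyota", "prius", "petrol", 16, 17),
    ("toyota", "prius", "hybrid", 250, 24),
]


def mileage_by(column):
    """Return {group value: (weighted average, plain average of averages)}.

    Hypothetical helper for illustration; the table name "cars" is assumed.
    """
    # The column name is interpolated into the SQL, so only allow known names.
    if column not in ("engine", "make", "model"):
        raise ValueError(f"unexpected grouping column: {column}")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cars (make TEXT, model TEXT, engine TEXT,"
        " cars_checked INTEGER, avg_mileage INTEGER)"
    )
    conn.executemany("INSERT INTO cars VALUES (?, ?, ?, ?, ?)", ROWS)
    query = f"""
        SELECT {column},
               -- 1.0 * forces real-number division on integer columns
               1.0 * SUM(cars_checked * avg_mileage) / SUM(cars_checked),
               AVG(avg_mileage)
        FROM cars
        GROUP BY {column}
    """
    result = {key: (weighted, plain) for key, weighted, plain in conn.execute(query)}
    conn.close()
    return result


by_engine = mileage_by("engine")
for engine, (weighted, plain) in sorted(by_engine.items()):
    print(f"{engine}: weighted={weighted:.2f} average_of_averages={plain:.2f}")
```

For diesel the weighted value is (150 * 16 + 18 * 19) / (150 + 18) = 2742 / 168, about 16.32, while the plain average of the two rows is 17.5. That gap is the average-of-averages problem the question wants to avoid.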

Answer 1 (score: 2)

One way to simplify the queries is to use grouping sets:

select engine, make, model,
       sum(cars_checked * avg_mileage) / sum(cars_checked) as avgMilage
from t
group by grouping sets ((engine), (make), (model));

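In the output, each row has a non-NULL value only in the column it was aggregated by; the other grouping columns are NULL.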
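
To show what that output shape looks like, here is a small sketch on the question's sample rows. SQLite (used here through Python's built-in sqlite3 module so the example runs without a server) has no GROUPING SETS, so the sketch swaps in a UNION ALL of three GROUP BY queries padded with NULLs, which should give the same row shape. It is not the PostgreSQL query itself. The integer column types, the weighted_mileage alias and the 1.0 * factor (to avoid integer division) are assumptions of this sketch.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("suzuki", "sx4", "petrol", 11, 12),
    ("suzuki", "sx4", "diesel", 150, 16),
    ("suzuki", "swift", "petrol", 140, 15),
    ("suzuki", "swift", "diesel", 18, 19),
    ("toyota", "prius", "petrol", 16, 17),
    ("toyota", "prius", "hybrid", 250, 24),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (make TEXT, model TEXT, engine TEXT,"
    " cars_checked INTEGER, avg_mileage INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)

# One SELECT per grouping column; the other two columns are padded with NULL,
# mimicking the row shape of GROUP BY GROUPING SETS ((engine), (make), (model)).
QUERY = """
    SELECT engine, NULL AS make, NULL AS model,
           1.0 * SUM(cars_checked * avg_mileage) / SUM(cars_checked) AS weighted_mileage
    FROM t GROUP BY engine
    UNION ALL
    SELECT NULL, make, NULL,
           1.0 * SUM(cars_checked * avg_mileage) / SUM(cars_checked)
    FROM t GROUP BY make
    UNION ALL
    SELECT NULL, NULL, model,
           1.0 * SUM(cars_checked * avg_mileage) / SUM(cars_checked)
    FROM t GROUP BY model
"""
rows = conn.execute(QUERY).fetchall()
conn.close()

# NULL comes back as None; exactly one of the first three fields is set per row.
for engine, make, model, mileage in rows:
    print(engine, make, model, round(mileage, 2))
```

To pull out one breakdown, for example the per-engine figures from #1, filter on the column that is not NULL (engine IS NOT NULL).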