SQL: average of the sum of a calculated attribute

Time: 2015-03-12 00:41:25

Tags: mysql sql join group-by aggregate-functions

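I am trying to get the average of values for a Category, where the rows are grouped by Subcategory using a calculated sum. The primary key of the parent table is the grouping attribute of the child table. The grouping attribute of the parent table is neither its primary key nor a column of the child table.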

Simplified representation:

select Category, avg(CalculatedSum)
from ParentTable pt
inner join (
    select Subcategory, sum(Quantity * Price) as 'CalculatedSum'
    from ChildTable
    group by Subcategory
    ) ct
on pt.ID = ct.Subcategory
group by Category

The actual SQL is as follows:

select c.CU_AGE_RANGE, count(*) as '# of Customers', avg(SumSales) as 'Avg of SumSales', max([Max of SumSales]) as 'Max of SumSales', min([Min of SumSales]) as 'Min of SumSales'
from Customers c
inner join (
    select CUSTOMER_ID, sum(QTY_SOLD * SALES) as SumSales, max(QTY_SOLD*SALES) as 'Max of SumSales', min(QTY_SOLD*SALES) as 'Min of SumSales'
    from Sales
    where (SALES > 0) and (QTY_SOLD > 0) and (COST > 0)
    Group by CUSTOMER_ID
    ) s
on c.CUSTOMER_ID = s.CUSTOMER_ID
group by c.CU_AGE_RANGE

I have tried changing the group by clause to various orderings of Category (CU_AGE_RANGE) and Subcategory (CUSTOMER_ID), but I always get the same error.

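The error is that the table always seems to show the SUM of the SUMs (I believe). I assume this is wrong because a typical average in the child table is 250 to 1000, whereas the value returned by Avg(Sum()) is roughly the number of rows in each Category multiplied by the expected Sum().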

Because I lack the reputation to post images, please refer to the following comma-separated result table:

CU_AGE_RANGE,#_of_Customers,Avg_of_SumSales,Max_of_SumSales,Min_of_SumSales
NULL,125,4261665.306,433460737.7,0.0017
20-29     ,1192,1154040.907,1374037708,0.00025
30-39     ,1902,25429.52329,29426212.64,0.00015
40-49     ,2118,2418.829874,2066725,0.0001
50-59     ,2204,114625.4111,248240261.3,0.00015
60+       ,2135,160156.4341,334617675,0.0005
patrickbig,1,65.5737,12,0.06
Under 19  ,484,1431.262112,92160,0.0001
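
As a rough sanity check on the table above (an addition for illustration, not part of the original post): because the subquery keeps only positive sales, each group average must be at least the largest single sale line divided by the row count. The sketch below uses a hypothetical helper name and assumes count(*) equals the number of rows being averaged. It computes what fraction of each reported average that one sale line would account for.

```python
# Rows copied from the comma-separated result table above:
# (CU_AGE_RANGE, # of Customers, Avg of SumSales, Max of SumSales)
rows = [
    ("NULL", 125, 4261665.306, 433460737.7),
    ("20-29", 1192, 1154040.907, 1374037708),
    ("30-39", 1902, 25429.52329, 29426212.64),
    ("40-49", 2118, 2418.829874, 2066725),
    ("50-59", 2204, 114625.4111, 248240261.3),
    ("60+", 2135, 160156.4341, 334617675),
    ("patrickbig", 1, 65.5737, 12),
    ("Under 19", 484, 1431.262112, 92160),
]


def share_of_average_from_largest_sale(count, avg, largest):
    """Fraction of the group average contributed by the single largest sale line.

    Hypothetical helper: assumes avg == total / count for the group.
    """
    return (largest / count) / avg


shares = {
    name: share_of_average_from_largest_sale(count, avg, largest)
    for name, count, avg, largest in rows
}

for name, share in shares.items():
    print(f"{name:<12}{share:.1%}")
```

For several age ranges the fraction comes out close to 1, which would be consistent with a few very large sale lines inflating a genuine average, though it does not rule out other causes.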

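I am trying to figure out why AVG(SUM()) returns what appears to be SUM(SUM()). My current hunch is that, because SUM() is a calculated entry, the calculated value is recomputed according to the grouping in the parent table. So this would be: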

DESIRED:

x * y for each row in Child Table
sum(x*y) for each Subcategory
avg(sum(x * y)) for each Category of Subcategory

QTY_SOLD * SALES for each row in Sales
sum(QTY_SOLD * SALES) for each CUSTOMER_ID
avg(sum(QTY_SOLD * SALES)) for each CU_AGE_RANGE group of CUSTOMER_IDs

ACTUAL:

x * y for each row in Child Table
sum(x * y) for each Subcategory
avg(sum(x * y)) for each Category

avg(sum(QTY_SOLD * SALES)) for each CU_AGE_RANGE

which equals:

sum(QTY_SOLD * SALES) for each CU_AGE_RANGE

How do I get from the current result (sum per Category) to the desired one (per-Category average of the per-Subcategory sums)?
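
The desired three-step computation can be written out in plain Python to give a reference value to compare a query against. This is only an illustrative sketch: the sample rows and the variable names are hypothetical and do not come from the actual tables.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical child-table rows: (subcategory_id, quantity, price)
child_rows = [(1, 2, 10.0), (1, 1, 5.0), (2, 3, 4.0), (3, 1, 100.0)]
# Hypothetical parent table: subcategory_id -> category
parent = {1: "A", 2: "A", 3: "B"}

# Steps 1 and 2: x * y for each row, summed per subcategory
sum_by_subcategory = defaultdict(float)
for sub_id, quantity, price in child_rows:
    sum_by_subcategory[sub_id] += quantity * price

# Step 3: average of the subcategory sums within each category
sums_by_category = defaultdict(list)
for sub_id, total in sum_by_subcategory.items():
    sums_by_category[parent[sub_id]].append(total)

avg_by_category = {cat: mean(totals) for cat, totals in sums_by_category.items()}
print(avg_by_category)
```

With this sample data, category A has subcategory sums of 25 and 12, so the desired average is 18.5, whereas a sum of sums would be 37.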

2 answers:

Answer 0: (score: 0)

Your customer count is wrong. You are counting sales, not customers. Changing it to count( DISTINCT c.CUSTOMER_ID ) should fix this.

select c.CU_AGE_RANGE, count( DISTINCT c.CUSTOMER_ID ) as '# of Customers', avg(SumSales) as 'Avg of SumSales', max([Max of SumSales]) as 'Max of SumSales', min([Min of SumSales]) as 'Min of SumSales'
from Customers c
inner join (
    select CUSTOMER_ID, sum(QTY_SOLD * SALES) as SumSales, max(QTY_SOLD*SALES) as 'Max of SumSales', min(QTY_SOLD*SALES) as 'Min of SumSales'
    from Sales
    where (SALES > 0) and (QTY_SOLD > 0) and (COST > 0)
    Group by CUSTOMER_ID
    ) s
on c.CUSTOMER_ID = s.CUSTOMER_ID
group by c.CU_AGE_RANGE
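
Whether count(*) and count( DISTINCT c.CUSTOMER_ID ) actually differ depends on the data. Because the subquery already returns one row per CUSTOMER_ID, count(*) counts joined rows, and the two counts should only diverge if Customers contains the same CUSTOMER_ID more than once. The sketch below illustrates that case using SQLite through Python's sqlite3 module rather than MySQL, with hypothetical data in which customer 1 is deliberately duplicated; the question does not say whether such duplicates exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CUSTOMER_ID INTEGER, CU_AGE_RANGE TEXT);
    CREATE TABLE Sales (CUSTOMER_ID INTEGER, QTY_SOLD REAL, SALES REAL);
    -- Hypothetical data: customer 1 appears twice in Customers
    INSERT INTO Customers VALUES (1, '20-29'), (1, '20-29'), (2, '20-29');
    INSERT INTO Sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 30);
""")

row_count, distinct_count = conn.execute("""
    SELECT COUNT(*), COUNT(DISTINCT c.CUSTOMER_ID)
    FROM Customers c
    INNER JOIN (
        SELECT CUSTOMER_ID, SUM(QTY_SOLD * SALES) AS SumSales
        FROM Sales
        GROUP BY CUSTOMER_ID
    ) s ON c.CUSTOMER_ID = s.CUSTOMER_ID
    WHERE c.CU_AGE_RANGE = '20-29'
""").fetchone()

# count(*) sees the duplicated customer twice; count(distinct) does not
print(row_count, distinct_count)
```

If both counts agree on the real tables, the customer count is not the source of the inflated averages.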

Answer 1: (score: 0)

Let's first consider the subquery:

select Subcategory, sum(Quantity * Price) as 'CalculatedSum'
from ChildTable
group by Subcategory

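Each record of the resulting relation represents the aggregate for one Subcategory. Now, avg(CalculatedSum) should yield the average of the CalculatedSum values. Try computing sum(CalculatedSum) as well and see whether there is a difference.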
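
The suggested comparison can be tried on a small data set. This is a minimal sketch that runs the simplified query in SQLite through Python's sqlite3 module, not in MySQL; the table contents are hypothetical, and the alias is written without quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ParentTable (ID INTEGER PRIMARY KEY, Category TEXT);
    CREATE TABLE ChildTable (Subcategory INTEGER, Quantity REAL, Price REAL);
    -- Hypothetical data: category A has two subcategories, B has one
    INSERT INTO ParentTable VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO ChildTable VALUES (1, 2, 10), (1, 1, 5), (2, 3, 4), (3, 1, 100);
""")

results = conn.execute("""
    SELECT Category, AVG(CalculatedSum), SUM(CalculatedSum)
    FROM ParentTable pt
    INNER JOIN (
        SELECT Subcategory, SUM(Quantity * Price) AS CalculatedSum
        FROM ChildTable
        GROUP BY Subcategory
    ) ct ON pt.ID = ct.Subcategory
    GROUP BY Category
    ORDER BY Category
""").fetchall()

for category, avg_of_sums, sum_of_sums in results:
    print(category, avg_of_sums, sum_of_sums)
```

On this data the average and the sum differ for category A (18.5 versus 37.0) and coincide for B, which has a single subcategory. So the query shape itself averages the per-subcategory sums, at least in SQLite.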