Collapsing and summing counts within a group

Time: 2015-04-02 20:30:07

Tags: sql sql-server tsql

I have the following data set:

SalesPerson PackageHistoryID    PackageID   SalesPersonID   EnrollmentAmount    PackageType
-------------------------------------------------------------------------------------------
Jim Jones   2895                310         59019           27.15               New Member
Jim Jones   2895                310         59019           53.21               New Member
Jim Jones   2895                310         59019           42.35               New Member
Jim Jones   2916                221         59019           379.01              Renewal
Jim Jones   2932                326         59019           53.21               New Member
Jim Jones   2932                326         59019           27.15               New Member
Jim Jones   2933                326         59019           53.21               Renewal
Jim Jones   2933                326         59019           27.15               Renewal

Against this data set, I run the following query:

select Salesperson, PackageType, count(*) AS Packages, sum(EnrollmentAmount) AS Enrollment
from Sales2
group by SalesPerson, PackageType
order by SalesPerson, PackageType

...and I get these results:

Salesperson    PackageType    Packages     Enrollment
----------------------------------------------------
Jim Jones      New Member     5            203.07
Jim Jones      Renewal        3            459.37

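The final result shown above is almost perfect. The only problem is the count in the Packages column. Instead of 5 and 3, the counts should be 2 and 2, because I want the column to give the number of packages per PackageHistoryID, not per EnrollmentAmount. I want to sum the EnrollmentAmounts so that the records are collapsed and no PackageHistoryID is ever repeated. The first data set shows a one-to-many relationship between PackageHistory records and EnrollmentAmounts. I thought my second query (the GROUP BY) would aggregate this correctly, but as you can see it reports 8 PackageHistories in total when it really should report only 4.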

Here is what the final result set should look like:

Salesperson    PackageType    Packages     Enrollment
----------------------------------------------------
Jim Jones      New Member     2            203.07
Jim Jones      Renewal        2            459.37

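The 2 and 2 indicate that there are really only 4 PackageHistory records in the result set: 2 are new members and 2 are renewals. The multiple EnrollmentAmount records produce extra rows, which incorrectly inflates the counts in the final query.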

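Important: although SalesPerson is always the same in the results shown, it can sometimes differ, but it is the same for any given PackageHistory (1-1). The grouping needs to be (1) by SalesPerson, then (2) by PackageType, with the EnrollmentAmounts summed/flattened within each unique PackageHistory.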

What query would give me the correct results?

1 answer:

Answer 0: (score: 7)

You should use count(distinct PackageHistoryID) instead of count(*):

select Salesperson, PackageType, count(distinct PackageHistoryID) AS Packages,
       sum(EnrollmentAmount) AS Enrollment
from Sales2
group by SalesPerson, PackageType
order by SalesPerson, PackageType
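
To check this against the sample data, here is a minimal sketch that loads the eight rows from the question into a temporary table and then gets the same counts a second way: collapse to one row per PackageHistoryID first, then count those rows. The temp table name #Sales2, the column types, and the aliases PerHistory and HistoryAmount are assumptions made for illustration; the question does not show the real table definition.

```sql
-- Hypothetical setup: a temp table standing in for Sales2 (column types are assumed).
CREATE TABLE #Sales2 (
    SalesPerson      varchar(50),
    PackageHistoryID int,
    PackageID        int,
    SalesPersonID    int,
    EnrollmentAmount decimal(10, 2),
    PackageType      varchar(20)
);

-- The eight sample rows from the question.
INSERT INTO #Sales2
    (SalesPerson, PackageHistoryID, PackageID, SalesPersonID, EnrollmentAmount, PackageType)
VALUES
    ('Jim Jones', 2895, 310, 59019,  27.15, 'New Member'),
    ('Jim Jones', 2895, 310, 59019,  53.21, 'New Member'),
    ('Jim Jones', 2895, 310, 59019,  42.35, 'New Member'),
    ('Jim Jones', 2916, 221, 59019, 379.01, 'Renewal'),
    ('Jim Jones', 2932, 326, 59019,  53.21, 'New Member'),
    ('Jim Jones', 2932, 326, 59019,  27.15, 'New Member'),
    ('Jim Jones', 2933, 326, 59019,  53.21, 'Renewal'),
    ('Jim Jones', 2933, 326, 59019,  27.15, 'Renewal');

-- Inner query: one row per PackageHistoryID, with its amounts summed.
-- Outer query: count those collapsed rows per SalesPerson and PackageType.
SELECT SalesPerson, PackageType,
       COUNT(*)           AS Packages,
       SUM(HistoryAmount) AS Enrollment
FROM (
    SELECT SalesPerson, PackageType, PackageHistoryID,
           SUM(EnrollmentAmount) AS HistoryAmount
    FROM #Sales2
    GROUP BY SalesPerson, PackageType, PackageHistoryID
) AS PerHistory
GROUP BY SalesPerson, PackageType
ORDER BY SalesPerson, PackageType;

DROP TABLE #Sales2;
```

On this sample the two-step query should return the rows the question asks for (New Member: 2 and 203.07; Renewal: 2 and 459.37), the same as the count(distinct PackageHistoryID) version. The derived table is mostly useful if you also need per-PackageHistory values, such as each history's subtotal; for the counts alone, count(distinct ...) is the shorter form.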