Optimizing SUM OVER PARTITION BY for several hierarchical groups

Asked: 2018-05-24 19:29:12

Tags: sql sql-server window-functions partition-by

I have a table like the one below:

Region    Country    Manufacturer    Brand    Period    Spend
R1        C1         M1              B1       2016      5
R1        C1         M1              B1       2017      10
R1        C1         M1              B1       2017      20
R1        C1         M1              B2       2016      15
R1        C1         M1              B3       2017      20
R1        C2         M1              B1       2017      5
R1        C2         M2              B4       2017      25
R1        C2         M2              B5       2017      30
R2        C3         M1              B1       2017      35
R2        C3         M2              B4       2017      40
R2        C3         M2              B5       2017      45
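
To try the queries in this thread, the sample rows above can be loaded with a script along these lines. This is only a sketch: the column types are assumptions (the question does not state them), and [Period] is assumed to hold the year as an integer, as it appears in the sample.

```sql
-- Assumed schema: the question does not give column types.
CREATE TABLE myTable (
    [Region]       VARCHAR(10) NOT NULL,
    [Country]      VARCHAR(10) NOT NULL,
    [Manufacturer] VARCHAR(10) NOT NULL,
    [Brand]        VARCHAR(10) NOT NULL,
    [Period]       INT         NOT NULL,  -- assumption: year stored as an integer
    [Spend]        INT         NOT NULL   -- assumption: whole-number spend
);

-- The eleven sample rows from the question, in the same order.
INSERT INTO myTable ([Region], [Country], [Manufacturer], [Brand], [Period], [Spend])
VALUES
    ('R1', 'C1', 'M1', 'B1', 2016,  5),
    ('R1', 'C1', 'M1', 'B1', 2017, 10),
    ('R1', 'C1', 'M1', 'B1', 2017, 20),
    ('R1', 'C1', 'M1', 'B2', 2016, 15),
    ('R1', 'C1', 'M1', 'B3', 2017, 20),
    ('R1', 'C2', 'M1', 'B1', 2017,  5),
    ('R1', 'C2', 'M2', 'B4', 2017, 25),
    ('R1', 'C2', 'M2', 'B5', 2017, 30),
    ('R2', 'C3', 'M1', 'B1', 2017, 35),
    ('R2', 'C3', 'M2', 'B4', 2017, 40),
    ('R2', 'C3', 'M2', 'B5', 2017, 45);
```

If [Period] is really a date column, as the YEAR([Period]) calls in the answers suggest, the type and the inserted values would need to change accordingly.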

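I need to find SUM([Spend]) over different groups, as follows:

  1. Total spend over all rows in the whole table
  2. Total spend per region
  3. Total spend per region and country group
  4. Total spend per region, country, and advertiser group (the [Manufacturer] column)

So I wrote the query below: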

    SELECT 
        [Period]
        ,[Region]
        ,[Country]
        ,[Manufacturer]
        ,[Brand]
        ,SUM([Spend]) OVER (PARTITION BY [Period]) AS [SumOfSpendWorld]
        ,SUM([Spend]) OVER (PARTITION BY [Period], [Region]) AS [SumOfSpendRegion]
        ,SUM([Spend]) OVER (PARTITION BY [Period], [Region], [Country]) AS [SumOfSpendCountry]
        ,SUM([Spend]) OVER (PARTITION BY [Period], [Region], [Country], [Manufacturer]) AS [SumOfSpendManufacturer]
    FROM myTable
    

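But for a table with only 450K rows, this query takes more than 15 minutes. I would like to know whether there is any way to optimize its performance. Thanks in advance for your answers and suggestions!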

3 answers:

Answer 0 (score: 4)

Your description of the problem suggests grouping sets to me:

SELECT YEAR([Period]) AS [Period], [Region], [Country], [Manufacturer], 
       SUM([Spend]) AS [SumOfSpend]
FROM myTable
GROUP BY GROUPING SETS ( (YEAR([Period])),
                         (YEAR([Period]), [Region]),
                         (YEAR([Period]), [Region], [Country]), 
                         (YEAR([Period]), [Region], [Country], [Manufacturer])
                        );

I don't know whether this will be faster, but it seems to match your question more closely.
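
A grouping-sets query returns one row per group rather than one row per detail row, so it can help to mark which level each result row belongs to. The sketch below does that with GROUPING_ID. It assumes [Period] already holds the year (as in the sample data); if it is a date, use YEAR([Period]) in the select list and in each grouping set instead. The [GroupLevel] and [SumOfSpend] aliases are made up for illustration.

```sql
SELECT [Period], [Region], [Country], [Manufacturer],
       -- Bitmask over the three columns, leftmost argument = highest bit:
       --   7 = per-period total, 3 = per-region, 1 = per-country, 0 = per-manufacturer
       GROUPING_ID([Region], [Country], [Manufacturer]) AS [GroupLevel],
       SUM([Spend]) AS [SumOfSpend]
FROM myTable
GROUP BY GROUPING SETS (
    ([Period]),
    ([Period], [Region]),
    ([Period], [Region], [Country]),
    ([Period], [Region], [Country], [Manufacturer])
);
```

Columns that are not part of a given grouping set come back as NULL in that row, which is why a level marker is easier to filter on than the NULLs themselves.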

Answer 1 (score: 1)

Use cross apply here to speed up the query:

 SELECT 
     periodyear
    ,[Region]
    ,[Country]
    ,[Manufacturer]
    ,[Brand]
    ,SUM([Spend]) OVER (PARTITION BY  periodyear) AS [SumOfSpendWorld]
    ,SUM([Spend]) OVER (PARTITION BY  periodyear, [Region]) AS [SumOfSpendRegion]
    ,SUM([Spend]) OVER (PARTITION BY  periodyear, [Region], [Country]) AS [SumOfSpendCountry]
    ,SUM([Spend]) OVER (PARTITION BY  periodyear, [Region], [Country], [Manufacturer]) AS [SumOfSpendManufacturer]
FROM myTable
  cross apply (select YEAR([Period]) periodyear) a
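
Whether a rewrite like this is actually faster on a given table is easiest to settle by measuring. One rough way, sketched here, is to turn on SQL Server's per-statement timing and I/O statistics around each candidate query and compare the figures printed in the Messages tab. The query in the middle is only a placeholder, and it assumes [Period] holds the year directly.

```sql
SET STATISTICS TIME ON;  -- report CPU and elapsed time per statement
SET STATISTICS IO ON;    -- report logical and physical reads per table

-- Placeholder: put the candidate query to be measured here.
SELECT [Period], [Region],
       SUM([Spend]) OVER (PARTITION BY [Period], [Region]) AS [SumOfSpendRegion]
FROM myTable;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```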

Answer 2 (score: 1)

The old-school version of SUM() OVER():

SELECT 
      [Period]
    , [Region]
    , [Country]
    , [Manufacturer]
    , [Brand]
    , (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] GROUP BY [Period]) AS [SumOfSpendWorld]
    , (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] AND e.Region = t.Region GROUP BY [Period], [Region] ) AS [SumOfSpendRegion]
    , (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] AND e.Region = t.Region AND e.Country = t.Country GROUP BY [Period], [Region], [Country] ) AS [SumOfSpendCountry]
    , (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] AND e.Region = t.Region AND e.Country = t.Country AND e.Manufacturer = t.Manufacturer GROUP BY [Period], [Region], [Country], [Manufacturer] ) AS [SumOfSpendManufacturer]
FROM myTable e

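It is not the elegant way, but it gets the job done. I strongly suggest looking at the table and analyzing it to see which alternative fits your situation best. If you think you have hit a dead end, then I would suggest using temp tables to speed things up. For example, you could select the rows for a given period, insert them directly into a temp table using bulk copy, and then do the magic there. I have seen tables that forced me to use temp tables instead of a simple select query. Others forced me to split the table into two tables.

So, it is not always clean!

I hope this gives you another insight that helps you on your journey.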
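
The temp-table suggestion above could look roughly like the following. This is a sketch under several assumptions rather than the answerer's actual method: it stages one period with SELECT ... INTO instead of a bulk copy, it assumes [Period] holds the year as an integer, and the names #Spend2017 and IX_Spend2017 are made up for illustration.

```sql
-- Stage only the rows for one period in a temp table (hypothetical name).
SELECT [Period], [Region], [Country], [Manufacturer], [Brand], [Spend]
INTO #Spend2017
FROM myTable
WHERE [Period] = 2017;

-- Optional: order the staged rows by the partitioning columns.
CREATE CLUSTERED INDEX IX_Spend2017
    ON #Spend2017 ([Region], [Country], [Manufacturer]);

-- Every staged row has the same period, so [Period] can be left out of PARTITION BY.
SELECT [Period], [Region], [Country], [Manufacturer], [Brand],
       SUM([Spend]) OVER ()                                                  AS [SumOfSpendWorld],
       SUM([Spend]) OVER (PARTITION BY [Region])                             AS [SumOfSpendRegion],
       SUM([Spend]) OVER (PARTITION BY [Region], [Country])                  AS [SumOfSpendCountry],
       SUM([Spend]) OVER (PARTITION BY [Region], [Country], [Manufacturer])  AS [SumOfSpendManufacturer]
FROM #Spend2017;

DROP TABLE #Spend2017;
```

The same steps would be repeated per period, or the WHERE clause widened, depending on how much data each pass can handle comfortably.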