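Aggregating duplicates after a UNION

Date: 2013-03-20 15:53:34

Tags: sql sql-server

I suspect there is a shorter way to write this query.

The same structure is repeated in several stored procedures. I CROSS JOIN the table's target elements with the DimDate view, setting every measure to 0, and then UNION that result with the actual results. The outer query then aggregates everything, so any duplicates collapse into a single row.

Is there a more efficient way to do this?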

SELECT Name,
       DateKey,
       Measure1 = SUM(Measure1),
       Measure2 = SUM(Measure2)
FROM (
    SELECT  SearchName AS Name,
        DateKey,
        Measure1 = SUM(Measure1),
        Measure2 = SUM(Measure2)
    FROM    WH.dbo.tb_r12028dxi_Data
    GROUP BY SearchName,
        DateKey
    UNION 
    SELECT  Name,   
        d.DateKey,
        0,
        0
    FROM    WH.dbo.vw_DimDate d
        CROSS JOIN
        WH.dbo.tb_r12028dxi_Data a  
    WHERE   d.DayMarker >= CONVERT(DATETIME,CONVERT(CHAR(6),DATEADD(MM,-24,GETDATE()),112) + '01',112)
    GROUP BY a.Name,    
        d.DateKey
    ) x
GROUP BY Name,
    DateKey

4 Answers:

Answer 0 (score: 1)

I'm not entirely sure it is more efficient, but you could try performing the GROUP BY / SUM only once in the query.

SELECT Name,
       DateKey,
       Measure1 = SUM(Measure1),
       Measure2 = SUM(Measure2)
FROM (
    SELECT  SearchName AS Name,
            DateKey,
            Measure1,
            Measure2
    FROM    WH.dbo.tb_r12028dxi_Data
    UNION ALL  -- a plain UNION would collapse identical raw rows before they are summed
    SELECT DISTINCT
            Name,   
            d.DateKey,
            0,
            0
    FROM    WH.dbo.vw_DimDate d
            CROSS JOIN WH.dbo.tb_r12028dxi_Data a  
    WHERE   d.DayMarker >= CONVERT(DATETIME,CONVERT(CHAR(6),DATEADD(MM,-24,GETDATE()),112) + '01',112)
    ) x
GROUP BY Name,
         DateKey
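
One detail worth checking in any variant that unions raw rows before summing them: UNION removes duplicate rows, so two identical detail rows are counted only once, while UNION ALL keeps both. The following is a minimal sketch of that difference. The table variable @Data and its values are made up for illustration and are not part of the original schema.

```sql
-- Hypothetical sample data: two identical detail rows for the same Name/DateKey.
DECLARE @Data TABLE (Name VARCHAR(20), DateKey INT, Measure1 INT);

INSERT INTO @Data (Name, DateKey, Measure1)
VALUES ('A', 20130101, 5),
       ('A', 20130101, 5);

-- UNION de-duplicates the combined rows first, so only one 5 reaches SUM.
SELECT Name, DateKey, Measure1 = SUM(Measure1)
FROM (
    SELECT Name, DateKey, Measure1 FROM @Data
    UNION
    SELECT 'A', 20130101, 0
) x
GROUP BY Name, DateKey;   -- Measure1 = 5

-- UNION ALL keeps both detail rows; the zero row adds nothing to the total.
SELECT Name, DateKey, Measure1 = SUM(Measure1)
FROM (
    SELECT Name, DateKey, Measure1 FROM @Data
    UNION ALL
    SELECT 'A', 20130101, 0
) x
GROUP BY Name, DateKey;   -- Measure1 = 10
```

If the data table can never contain two fully identical rows, both forms return the same totals, and UNION ALL simply avoids the extra de-duplication work.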

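Answer 1 (score: 1)

You can use a left outer join instead. It may not look any simpler, but it is likely easier for the database to evaluate than a UNION.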

SELECT x.Name,
       x.DateKey,
       Measure1 = ISNULL(SUM(sum_table.Measure1), 0),  -- unmatched dates would otherwise sum to NULL, not 0
       Measure2 = ISNULL(SUM(sum_table.Measure2), 0)
FROM (SELECT DISTINCT a.Name, d.DateKey
      FROM    WH.dbo.vw_DimDate d
              CROSS JOIN WH.dbo.tb_r12028dxi_Data a
      WHERE   d.DayMarker >= CONVERT(DATETIME,CONVERT(CHAR(6),DATEADD(MM,-24,GETDATE()),112) + '01',112)) x
  LEFT OUTER JOIN WH.dbo.tb_r12028dxi_Data sum_table
    ON x.Name = sum_table.Name AND x.DateKey = sum_table.DateKey
GROUP BY x.Name, x.DateKey

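Note that this assumes every value you need from WH.dbo.tb_r12028dxi_Data also appears in the cross join. Otherwise, you need a full outer join rather than a left outer join.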
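
To illustrate the full outer join case, here is a minimal sketch. The table variables @Dates (standing in for the already date-filtered vw_DimDate) and @Data, along with their values, are hypothetical. One data row falls outside the date window, so a left join from the Name/DateKey grid would drop it, while the full outer join keeps it.

```sql
-- Hypothetical stand-ins for the filtered date view and the data table.
DECLARE @Dates TABLE (DateKey INT);
DECLARE @Data  TABLE (Name VARCHAR(20), DateKey INT, Measure1 INT);

INSERT INTO @Dates (DateKey) VALUES (20130101), (20130102);

INSERT INTO @Data (Name, DateKey, Measure1)
VALUES ('A', 20130101, 5),
       ('A', 20100101, 7);   -- outside the date window covered by @Dates

-- COALESCE picks the key from whichever side of the join is present.
SELECT Name     = COALESCE(x.Name, s.Name),
       DateKey  = COALESCE(x.DateKey, s.DateKey),
       Measure1 = COALESCE(SUM(s.Measure1), 0)
FROM (SELECT DISTINCT a.Name, d.DateKey
      FROM @Dates d
           CROSS JOIN @Data a) x
FULL OUTER JOIN @Data s
    ON x.Name = s.Name AND x.DateKey = s.DateKey
GROUP BY COALESCE(x.Name, s.Name),
         COALESCE(x.DateKey, s.DateKey);
```

With this sample data the result has three rows: 20100101 with 7 (only in the data), 20130101 with 5 (in both), and 20130102 with 0 (only in the grid).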

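Answer 2 (score: 1)

I'm not sure I have this exactly right, since I don't have the schema to test against, but this may be more efficient. It uses UNION ALL, which skips the duplicate-elimination step that UNION performs and is therefore generally cheaper. That is safe here because the first part of the union is grouped (so it has no duplicates), and the second part checks against the first to make sure it adds none. The only reason I question the efficiency is that the first part of the union may have to be evaluated twice.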

WITH DataResults AS (
    SELECT  SearchName AS Name,
            DateKey,
            Measure1 = SUM(Measure1),
            Measure2 = SUM(Measure2)
    FROM    WH.dbo.tb_r12028dxi_Data
    GROUP BY SearchName,
             DateKey
)
SELECT * FROM DataResults
UNION ALL
SELECT  a.Name,
        d.DateKey,
        0,
        0
FROM    WH.dbo.vw_DimDate d
        CROSS JOIN WH.dbo.tb_r12028dxi_Data a
WHERE   d.DayMarker >= CONVERT(DATETIME,CONVERT(CHAR(6),DATEADD(MM,-24,GETDATE()),112) + '01',112)
  -- Check for existence within the upper part of the union.
  AND NOT EXISTS (SELECT 1 FROM DataResults
                  WHERE a.Name = DataResults.Name -- I'm making an assumption here that Name is in tb_r12028dxi_Data. You didn't say.
                    AND d.DateKey = DataResults.DateKey)
GROUP BY a.Name,
         d.DateKey
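
The pattern above (aggregated rows, plus zero rows only for the combinations that have no data) can be tried in isolation. This is a minimal sketch under stated assumptions: @Dates and @Data are hypothetical table variables with made-up values, and a single Name column is assumed.

```sql
-- Hypothetical stand-ins for the filtered date view and the data table.
DECLARE @Dates TABLE (DateKey INT);
DECLARE @Data  TABLE (Name VARCHAR(20), DateKey INT, Measure1 INT);

INSERT INTO @Dates (DateKey) VALUES (20130101), (20130102);

INSERT INTO @Data (Name, DateKey, Measure1)
VALUES ('A', 20130101, 5),
       ('A', 20130101, 3),
       ('B', 20130102, 2);

WITH DataResults AS (
    -- One row per Name/DateKey that actually has data.
    SELECT Name, DateKey, Measure1 = SUM(Measure1)
    FROM   @Data
    GROUP BY Name, DateKey
)
SELECT Name, DateKey, Measure1
FROM   DataResults
UNION ALL
-- Zero rows only for Name/DateKey pairs missing from DataResults,
-- so UNION ALL cannot introduce duplicates.
SELECT n.Name, d.DateKey, 0
FROM   @Dates d
       CROSS JOIN (SELECT DISTINCT Name FROM @Data) n
WHERE  NOT EXISTS (SELECT 1
                   FROM   DataResults r
                   WHERE  r.Name = n.Name
                     AND  r.DateKey = d.DateKey);
```

With this sample data the result has four rows: A/20130101 with 8 and B/20130102 with 2 from the aggregated part, plus A/20130102 and B/20130101 with 0 from the fill-in part. No outer GROUP BY is needed because each Name/DateKey pair appears exactly once.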

Answer 3 (score: 0)

A variation on pswg's approach, which I think may be more efficient:

SELECT Name,
       DateKey,
       Measure1 = SUM(Measure1),
       Measure2 = SUM(Measure2)
FROM (SELECT SearchName AS Name,
             DateKey,
             Measure1,
             Measure2
      FROM   WH.dbo.tb_r12028dxi_Data
      UNION ALL  -- a plain UNION would collapse identical raw rows before they are summed
      SELECT Name,   
             d.DateKey,
             0,
             0
      FROM   WH.dbo.vw_DimDate d
      CROSS JOIN (SELECT DISTINCT Name FROM WH.dbo.tb_r12028dxi_Data) a  
      WHERE d.DayMarker >= CONVERT(DATETIME,CONVERT(CHAR(6),DATEADD(MM,-24,GETDATE()),112) + '01',112)
     ) x
GROUP BY Name, DateKey