填写SQL查询结果集中的缺失(时间段)行

时间:2014-08-14 18:22:24

标签: sql sql-server date datetime timespan

更新:2014年8月15日:根据LMU92的建议/样本,查看最终的最终工作解决方案

我有一个查询金融交易表的查询。该表是详细事务的汇总,每晚重新生成,仅用于读/ SELECT。平台是SQL Server 2012。

此抽样是主查询的结果,主查询按时间段,帐户和类别返回历史SUM(金额)。报告的时间窗口是参数驱动的,对于此样本,从2014年1月1日到2014年5月31日:

    TimePeriod   Start      End        Category        Account           Amount
    ----------   --------   ---------  ------------    ----------------  ------------
    month        1/1/2014   1/31/2014  CategoryX       AccountA          2421.00
    month        4/1/2014   4/30/2014  CategoryX       AccountA          1421.00 
    month        5/1/2014   5/31/2014  CategoryY       AccountA          9421.00
    month        1/1/2014   1/31/2014  CategoryZ       AccountB          2421.00
    month        3/1/2014   3/31/2014  CategoryZ       AccountB          6421.00
    ...

我所追求的结果将通过用0.00金额填补任何缺口(无交易)期间来消除差距,例如,对于Month / AccountA / CategoryX:

    TimePeriod   Start      End        Category        Account           Amount
    ----------   --------   ---------  ------------    ----------------  ------------
    month        1/1/2014   1/31/2014  CategoryX       AccountA          2421.00
    month        2/1/2014   2/28/2014  CategoryX       AccountA             0.00
    month        3/1/2014   3/31/2014  CategoryX       AccountA             0.00
    month        4/1/2014   4/30/2014  CategoryX       AccountA          1421.00 
    month        5/1/2014   5/31/2014  CategoryX       AccountA             0.00 

挑战在于汇总是多个期间类型(日/周/月/季/年),每个类型都可以按帐户/类别细分。所有时期的总记录总数为1000万,并且随着时期变得更加分数(例如数周/天),数字会增加。

我已经尝试过一个表现不佳的CTE(尽管索引调整,它似乎处理很多),还尝试添加0.00记录,这些记录以指数方式增加了池中的记录数,因为它是每个事务少一个记录期间(d / w / m / q / y),每个帐户(从帐户的第一个交易日期开始),每个类别,需要大量的索引调整才能获得如此大的池以使我们接受/接近可接受的性能,并且还增加了执行夜间负载所需的时间。我考虑做一个立方体,但这似乎对我们正在做的事情有点过分。

我正在寻找的解决方案可以动态执行此操作,并且无需日历表。如果这是动态执行此操作的唯一有效方法,我确实有一个日历维度表。

非常感谢任何建议。

DDL /(T-)SQL的简化版本:

表:

CREATE TABLE TxnRollups (
    TxnTimePeriod          VARCHAR(10), --Year/Quarter/Month/Week/Day
    TxnPeriodStartDate     DATE,
    TxnPeriodEndDate       DATE,
    TxnAccountID           VARCHAR(10),
    TxnAccountType         VARCHAR(20),
    TxnAccountName         VARCHAR(20),
    TxnAccountHierL1       VARCHAR(20),
    TxnAccountHierL2       VARCHAR(20),       
    TxnAccountHierL3       VARCHAR(20),
    TxnCategory            VARCHAR(20),
    Amount                 DECIMAL(16,3)
)

查询:

    CREATE PROCEDURE GetTxnByPeriod(@FromDate DATE, @ToDate DATE, @SummaryPeriod VARCHAR(20)) 
      AS
        BEGIN
                SELECT TxnR.TxnTimePeriod   TimePeriod,
                       TxnR.TxnPeriodStart  Start,
                       TxnR.TxnPeriodEnd    End,
                       TxnR.TxnCategory     Category,
                       TxnR.TxnAccountName  Account,
                       SUM(TxnRAmount)      Amount
                From TxnRollups TxnR
                WHERE 
                       TxnR.TxnTImePeriod = @SummaryPeriod AND
                       TxnR.TxnPeriodEnd BETWEEN @FromDate AND @ToDate
                GROUP BY
                       TxnR.TxnTimePeriod,
                       TxnR.TxnPeriodStart,
                       TxnR.TxnPeriodEnd,
                       TxnR.TxnCategory,
                       TxnR.TxnAccountName

    END

更新:2014年8月15日:最终工作解决方案

概述

通过以下方法,我能够生成所需的结果集,并且无需使用汇总,而是直接查询原始数据集。最重要的是:

执行时间:1k-1.5k ms ,具体取决于返回的数据量。我们甚至没有从我们的汇总中获得这种表现。

结构:

  1. 根据LMU92的初步建议,使用FirstdayofMonth / LastdayofMonth创建Months表
  2. 使用第一个/上个月列更改我们的交易详细信息原始(交易详细信息)数据集,并更新其内容以匹配每个相应行日期的当月的第一天/最后一天。
  3. 向两个表添加了数字(FLOAT)列,并将First / Last日期表示存储为数字。这样做的目的是通过使用浮动进行日期比较来提高性能。
  4. 在数字日期列上创建索引。
  5. 在一组唯一列(帐户ID /类别)和月份表上使用带有交叉连接的CTE。 6 CTE结果到#Temp表。 7在#Temp表上创建非聚集索引 8从事务详细信息表中选择一个不同的日期/帐户/类别/总和(金额)列表到另一个#Temp表中。 9添加了索引 10完整日期/帐户表的正确外部联接,包含交易明细,根据需要得出结果(见问题)。
  6. **这是代码的缩写版本(某些调试,非必要的删除)**

    CREATE PROCEDURE    GetTransactionsByMonth
                        (
                                @FromDate                   DATE                =           NULL        , 
                                @ToDate                     DATE                =           NULL        ,
                                @TxnCategory                VARCHAR(7)          =           'CAT1'      ,
                                @AccountType1               VARCHAR(21)         =           NULL        ,
                                @AccountType2               VARCHAR(21)         =           NULL        ,
                                @AccountType3               VARCHAR(21)         =           NULL        ,
                                @Debug                      BIT                 =           0           
                        ) 
    
    WITH RECOMPILE
    
                AS
                    BEGIN
    
                        DECLARE @True   AS BIT = 1
                        DECLARE @False  AS BIT = 0
    
                        PRINT IIF(@Debug = @True, 'START Procedure - ' + CAST(SYSDATETIMEOFFSET() AS VARCHAR),NULL)
                        IF @Debug = @True
                            BEGIN
                                SET STATISTICS TIME ON
                                SET STATISTICS IO   ON
                                DECLARE @NoCountState int = @@OPTIONS & 512;
                                SET NOCOUNT OFF;
                            END
                        /*================================================================================================================
                            Initialization - Declarations
                          ================================================================================================================*/
                        DECLARE         @MinDateN                       FLOAT
                        DECLARE         @MaxDateN                       FLOAT
                        DECLARE         @MinDateD                       DATE
                        DECLARE         @MaxDateD                       DATE
                        /*================================================================================================================
                            Initialization - Establish Date Ranges
                          ================================================================================================================*/
                        SET @FromDate   =   DATEADD(MM, DATEDIFF(MM, 0, @FromDate), 0)   --Set @FromDate to first of the requested month
                        SET @MinDateN   =   FLOOR(CAST(CAST(@FromDate as DateTime) as float)) 
                        SET @ToDate     =   DATEADD(MM, DATEDIFF(MM, 0, @ToDate), 0)     --Set @ToDate to first of the requested month
                        SET @MaxDateN   =   FLOOR(CAST(CAST(@ToDate as DateTime) as float)) 
                        SET @MinDateD = CAST(FLOOR(CAST(@MinDateN AS FLOAT)) AS DATETIME)  --For output only
                        SET @MaxDateD = CAST(FLOOR(CAST(@MaxDateN AS FLOAT)) AS DATETIME) --For output only
    
                        ; WITH CTE_MonthsRollUp AS (
    
                            SELECT DISTINCT
                                            MonthsTable.NFirstDayOfMonth        AS      PeriodStartN            ,
                                            MonthsTable.NLastDayOfMonth         AS      PeriodEndN              ,
                                            MonthsTable.FirstDayOfMonth         AS      PeriodStartD            ,
                                            MonthsTable.LastDayOfMonth          AS      PeriodEndD              ,
                                            TransactionDetail.AccountId         AS      AccountId           ,
                                            TransactionDetail.AcctCategory      AS      AcctCategory
    
                                      FROM  dbo.Months                          AS      MonthsTable
    
                                CROSS JOIN  dbo.tblTxnDetail                    AS      TransactionDetail
    
                                     WHERE  TransactionDetail.AccountId     IS      NOT NULL
                                     AND    TransactionDetail.AcctCategory       =      @AcctCategory
                                     AND    TransactionDetail.AccountType       IN      (
                                                                                            @AccountType1       ,
                                                                                            @AccountType2       ,
                                                                                            @AccountType3       ,
                                                                                        )
    
                                       AND  MonthsTable.NFirstDayOfMonth        <=      @MaxDateN
                                       AND  MonthsTable.NLastDayOfMonth         >=      @MinDateN
                                       AND  MonthsTable.NLastDayOfMonth         >=      (
                                                                                            SELECT  MIN(NFirstDayOfMonth)           AS  EarliestTxnDateN
                                                                                            FROM    tblTxnDetail        AS  ValidateAccount
                                                                                            WHERE   ValidateAccount.AccountId   =   TransactionDetail.AccountId
                                                                                        )
                            GROUP BY
                                            MonthsTable.NFirstDayOfMonth                                        ,
                                            MonthsTable.NLastDayOfMonth                                         ,
                                            MonthsTable.FirstDayOfMonth                                         ,
                                            MonthsTable.LastDayOfMonth                                          ,
                                            TransactionDetail.AccountId                                     ,
                                            TransactionDetail.AcctCategory
    
                        )
    
                        SELECT      MonthsRollupResults.* 
                          INTO      #TMonthsRollup
                          FROM      CTE_MonthsRollUp MonthsRollupResults
                        OPTION     (RECOMPILE)
    
                        ;           
                        CREATE NONCLUSTERED INDEX   [#idxTMonthsRollup_AccountIDandTxnCategory_Join]
                                               ON   [dbo].[#TMonthsRollup] ([AccountId],[AcctCategory], PeriodStartN)
                        INCLUDE                     (PeriodStartD, PeriodEndD)
                        ;
    
                        ; WITH CTE_AccountList  AS  (
                                                        SELECT DISTINCT     DistinctAccountList.NFirstDayOfMonth        AS      PeriodStartN            ,
                                                                            DistinctAccountList.NLastDayOfMonth         AS      PeriodEndN              ,
                                                                            DistinctAccountList.AccountId               AS      AccountId               , 
                                                                            DistinctAccountList.AcctCategory            AS      AcctCategory            ,
                                                                            DistinctAccountList.AccountType             AS      AccountType             ,
                                                                            DistinctAccountList.Account                 AS      AccountName             ,
                                                                            DistinctAccountList.ACCOUNTL1               AS      AccountHierarchyL1      ,
                                                                            DistinctAccountList.ACCOUNTL2               AS      AccountHierarchyL2      ,
                                                                            DistinctAccountList.ACCOUNTL3               AS      AccountHierarchyL3      ,
                                                                            SUM(DistinctAccountList.Amount)             AS      PeriodAmount        
    
    
                                                                   FROM     tblTxnDetail                                AS      DistinctAccountList
    
                                                                  WHERE     DistinctAccountList.NFirstDayOfMonth        <=      @MaxDateN
                                                                    AND     DistinctAccountList.NLastDayOfMonth         >=      @MinDateN
                                                                    AND     DistinctAccountList.AccountId               IS      NOT NULL
                                                                    AND     DistinctAccountList.AcctCategory            =       @AcctCategory
                                                                    AND     DistinctAccountList.AccountType             IN      (
                                                                                                                                    @AccountType1       ,
                                                                                                                                    @AccountType2       ,
                                                                                                                                    @AccountType3       ,
    
                                                                                                                                )
    
                                                               GROUP BY     DistinctAccountList.NFirstDayOfMonth                                ,
                                                                            DistinctAccountList.NLastDayOfMonth                                 ,
                                                                            DistinctAccountList.AccountId                                       ,
                                                                            DistinctAccountList.AcctCategory                                    ,
                                                                            DistinctAccountList.AccountType                                     ,
                                                                            DistinctAccountList.Account                                         ,
                                                                            DistinctAccountList.ACCOUNTL1                                       ,
                                                                            DistinctAccountList.ACCOUNTL2                                       ,
                                                                            DistinctAccountList.ACCOUNTL3                                       
    
    
                        ) 
    
    
                        SELECT      DistinctAccountList.* 
                          INTO      #TAccountList
                          FROM      CTE_AccountList DistinctAccountList
                        ;
    
                        CREATE NONCLUSTERED INDEX   [#idxTAccountList_DistincAccountDetailsList_For_Join]
                                               ON   [dbo].[#TAccountList] ([AccountId],[AcctCategory], PeriodStartN)
                        INCLUDE                     (AccountName, PeriodCredit, PeriodDebit, PeriodAmount, AccountType, 
                                                     AccountHierarchyL1,AccountHierarchyL2,AccountHierarchyL3,AccountHierarchyL4,
                                                     AccountHierarchyL5, BSReportHierarchyL1, BSReportHierarchyL2, BSReportHierarchyL3,
                                                     BSReportHierarchyL4, PLReportHierarchyL1, PLReportHierarchyL2, PLReportHierarchyL3
                                                    )
                        ;
                        SELECT DISTINCT         'Month'                             AS      Period              ,
                                                @MinDateD                           AS      ReportStart         ,
                                                @MaxDateD                           AS      ReportEnd           ,   
                                                tMonthRollup.AccountId              AS      AccountId           , 
                                                tMonthRollup.AcctCategory           AS      AcctCategory        , 
                                                tMonthRollup.PeriodStartD           AS      PeriodStart         , 
                                                tMonthRollup.PeriodEndD             AS      PeriodEnd           , 
                                                AccountList.AccountName             AS      AccountName         ,
                                                AccountList.PeriodAmount            AS      PeriodAmount        ,
                                                AccountList.AccountType             AS      AccountType         ,
                                                AccountList.AccountHierarchyL1      AS      AccountHierarchyL1  ,
                                                AccountList.AccountHierarchyL2      AS      AccountHierarchyL2  ,
                                                AccountList.AccountHierarchyL3      AS      AccountHierarchyL3  
    
    
                                   FROM         #TAccountList                       AS      AccountList
                       RIGHT OUTER JOIN         #TMonthsRollup                      AS      tMonthRollup
    
                                     ON         AccountList.AccountId           =       tMonthRollup.AccountId
                                    AND         Accountlist.AcctCategory                =       tMonthRollup.AcctCategory
                                    AND         AccountList.PeriodStartN            =       tMonthRollup.PeriodStartN
    
                --------------------------------------------------------
                PRINT IIF(@Debug = @True, 'END PROCEDURE - ' + CAST(SYSDATETIMEOFFSET() AS VARCHAR),NULL)
                --------------------------------------------------------
                IF @Debug = @True
                    BEGIN
                        SET STATISTICS TIME OFF
                        SET STATISTICS IO   OFF
                        IF @NoCountState <> 0 
                            SET NOCOUNT ON
                    END
    END
    

2 个答案:

答案 0 :(得分:1)

这是一个编码版本,如何执行汇总并将其与日历表联系起来:

            WITH cte_Rollup as
           (SELECT TxnR.TxnTimePeriod   TimePeriod,
                   TxnR.TxnPeriodStart  Start,
                   TxnR.TxnPeriodEnd    End,
                   TxnR.TxnCategory     Category,
                   TxnR.TxnAccountName  Account,
                   SUM(TxnRAmount)      Amount
            From TxnRollups TxnR
            WHERE 
                   TxnR.TxnTImePeriod = @SummaryPeriod AND
                   TxnR.TxnPeriodEnd BETWEEN @FromDate AND @ToDate
            GROUP BY
                   TxnR.TxnTimePeriod,
                   TxnR.TxnPeriodStart,
                   TxnR.TxnPeriodEnd,
                   TxnR.TxnCategory,
                   TxnR.TxnAccountName
            ), cte_Calendar AS
            (
            SELECT cal.Period, Min(cal.PeriodDate) as PeriodStart, Max(cal.PeriodDate) as PeriodEnd
            FROM calendar cal
            WHERE  cal.PeriodDate BETWEEN @FromDate AND @ToDate AND cal.Period = @SummaryPeriod
            GROUP BY cal.TImePeriod,cal.MonthValue
            )
            SELECT *
            FROM cte_Calendar
            LEFT OUTER JOIN cte_Rollup  ON cte_Calendar.PeriodStart = cte_Rollup.TxnPeriodStart

答案 1 :(得分:0)

我建议使用仅包含列的索引月份表:开始日期和结束日期。 然后在TxnRollups表上执行简单的左外连接。 将月份表(StartDate和EndDate)中的聚集索引添加到TxnPeriodStart和TxnPeriodEnd上的TxnRollups上的noclustered索引将有助于进一步加快速度。