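Update, 15 Aug 2014: see the final working solution below, based on LMU92's suggestion/sample.
I have a query against a table of financial transactions. The table is a rollup of detailed transactions, is regenerated nightly, and is used for reads (SELECT) only. The platform is SQL Server 2012.
This sample is the output of the main query, which returns the historical SUM(Amount) by time period, account, and category. The reporting window is parameter-driven; for this sample it runs from 1 Jan 2014 to 31 May 2014: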
TimePeriod  Start     End        Category   Account   Amount
----------  --------  ---------  ---------  --------  -------
month       1/1/2014  1/31/2014  CategoryX  AccountA  2421.00
month       4/1/2014  4/30/2014  CategoryX  AccountA  1421.00
month       5/1/2014  5/31/2014  CategoryY  AccountA  9421.00
month       1/1/2014  1/31/2014  CategoryZ  AccountB  2421.00
month       3/1/2014  3/31/2014  CategoryZ  AccountB  6421.00
...
The result I'm after closes the gaps by adding a 0.00 amount for every period that has no transactions. For example, for Month / AccountA / CategoryX:
TimePeriod  Start     End        Category   Account   Amount
----------  --------  ---------  ---------  --------  -------
month       1/1/2014  1/31/2014  CategoryX  AccountA  2421.00
month       2/1/2014  2/28/2014  CategoryX  AccountA  0.00
month       3/1/2014  3/31/2014  CategoryX  AccountA  0.00
month       4/1/2014  4/30/2014  CategoryX  AccountA  1421.00
month       5/1/2014  5/31/2014  CategoryX  AccountA  0.00
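The challenge is that the rollup covers several period types (day/week/month/quarter/year), and each of them can be broken down by account and category. The total record count across all periods is about 10 million, and it grows as the periods become more granular (e.g. weeks/days).
I have tried a CTE, which performed poorly (despite index tuning, it seems to process a lot of rows). I have also tried adding 0.00 records to the rollup. That increased the number of records in the pool exponentially, because it adds one record per no-transaction period (d/w/m/q/y), per account (starting from the account's first transaction date), per category. A pool that large needed a lot of index tuning to get anywhere near acceptable performance, and it also increased the time needed for the nightly load. I considered building a cube, but that seems like overkill for what we are doing.
I'm looking for a solution that does this dynamically, ideally without a calendar table. That said, I do have a calendar dimension table if that turns out to be the only efficient way to do it dynamically.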
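For reference, period rows can be generated on the fly from a small numbers CTE instead of a calendar table. The following is only a minimal sketch for monthly periods, not a tested solution for the full d/w/m/q/y case; the CTE names (Digits, Numbers, Months) and the sample variable values are assumptions made for illustration.

```sql
-- Illustrative reporting window (same range as the sample above).
DECLARE @FromDate DATE = '2014-01-01';
DECLARE @ToDate   DATE = '2014-05-31';

WITH Digits AS (
    SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS d(n)
),
Numbers AS (
    -- 0..999 month offsets, which covers a window of up to about 83 years
    SELECT ones.n + 10 * tens.n + 100 * hundreds.n AS n
    FROM Digits AS ones
    CROSS JOIN Digits AS tens
    CROSS JOIN Digits AS hundreds
),
Months AS (
    -- One row per month, starting at the first day of @FromDate's month
    SELECT DATEADD(MONTH, n, DATEFROMPARTS(YEAR(@FromDate), MONTH(@FromDate), 1)) AS PeriodStart
    FROM Numbers
    WHERE n <= DATEDIFF(MONTH, @FromDate, @ToDate)
)
SELECT PeriodStart,
       EOMONTH(PeriodStart) AS PeriodEnd
FROM Months
ORDER BY PeriodStart;
```

DATEFROMPARTS and EOMONTH are available from SQL Server 2012 onward. Whether this performs better than joining to an indexed calendar dimension would have to be measured against the real data.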
Any suggestions are greatly appreciated.
Simplified version of the DDL / (T-)SQL:
Table:
CREATE TABLE TxnRollups (
TxnTimePeriod VARCHAR(10), --Year/Quarter/Month/Week/Day
TxnPeriodStartDate DATE,
TxnPeriodEndDate DATE,
TxnAccountID VARCHAR(10),
TxnAccountType VARCHAR(20),
TxnAccountName VARCHAR(20),
TxnAccountHierL1 VARCHAR(20),
TxnAccountHierL2 VARCHAR(20),
TxnAccountHierL3 VARCHAR(20),
TxnCategory VARCHAR(20),
Amount DECIMAL(16,3)
)
Query:
CREATE PROCEDURE GetTxnByPeriod(@FromDate DATE, @ToDate DATE, @SummaryPeriod VARCHAR(20))
AS
BEGIN
    SELECT TxnR.TxnTimePeriod      AS TimePeriod,
           TxnR.TxnPeriodStartDate AS [Start],
           TxnR.TxnPeriodEndDate   AS [End],      --END is a reserved word, so the alias is bracketed
           TxnR.TxnCategory        AS Category,
           TxnR.TxnAccountName     AS Account,
           SUM(TxnR.Amount)        AS Amount
    FROM TxnRollups TxnR
    WHERE
        TxnR.TxnTimePeriod = @SummaryPeriod AND
        TxnR.TxnPeriodEndDate BETWEEN @FromDate AND @ToDate
    GROUP BY
        TxnR.TxnTimePeriod,
        TxnR.TxnPeriodStartDate,
        TxnR.TxnPeriodEndDate,
        TxnR.TxnCategory,
        TxnR.TxnAccountName
END
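Update, 15 Aug 2014: final working solution
Overview
With the approach below I was able to produce the desired result set without using the rollups at all, by querying the raw data set directly. Most importantly:
Execution time: 1,000-1,500 ms, depending on the amount of data returned. We never got this kind of performance even from our rollups.
Structure:
**This is an abbreviated version of the code (some debugging and non-essential parts removed)**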
CREATE PROCEDURE GetTransactionsByMonth
(
@FromDate DATE = NULL ,
@ToDate DATE = NULL ,
@AcctCategory VARCHAR(7) = 'CAT1' ,
@AccountType1 VARCHAR(21) = NULL ,
@AccountType2 VARCHAR(21) = NULL ,
@AccountType3 VARCHAR(21) = NULL ,
@Debug BIT = 0
)
WITH RECOMPILE
AS
BEGIN
DECLARE @True AS BIT = 1
DECLARE @False AS BIT = 0
PRINT IIF(@Debug = @True, 'START Procedure - ' + CAST(SYSDATETIMEOFFSET() AS VARCHAR),NULL)
IF @Debug = @True
BEGIN
SET STATISTICS TIME ON
SET STATISTICS IO ON
DECLARE @NoCountState int = @@OPTIONS & 512;
SET NOCOUNT OFF;
END
/*================================================================================================================
Initialization - Declarations
================================================================================================================*/
DECLARE @MinDateN FLOAT
DECLARE @MaxDateN FLOAT
DECLARE @MinDateD DATE
DECLARE @MaxDateD DATE
/*================================================================================================================
Initialization - Establish Date Ranges
================================================================================================================*/
SET @FromDate = DATEADD(MM, DATEDIFF(MM, 0, @FromDate), 0) --Set @FromDate to first of the requested month
SET @MinDateN = FLOOR(CAST(CAST(@FromDate as DateTime) as float))
SET @ToDate = DATEADD(MM, DATEDIFF(MM, 0, @ToDate), 0) --Set @ToDate to first of the requested month
SET @MaxDateN = FLOOR(CAST(CAST(@ToDate as DateTime) as float))
SET @MinDateD = CAST(FLOOR(CAST(@MinDateN AS FLOAT)) AS DATETIME) --For output only
SET @MaxDateD = CAST(FLOOR(CAST(@MaxDateN AS FLOAT)) AS DATETIME) --For output only
; WITH CTE_MonthsRollUp AS (
SELECT DISTINCT
MonthsTable.NFirstDayOfMonth AS PeriodStartN ,
MonthsTable.NLastDayOfMonth AS PeriodEndN ,
MonthsTable.FirstDayOfMonth AS PeriodStartD ,
MonthsTable.LastDayOfMonth AS PeriodEndD ,
TransactionDetail.AccountId AS AccountId ,
TransactionDetail.AcctCategory AS AcctCategory
FROM dbo.Months AS MonthsTable
CROSS JOIN dbo.tblTxnDetail AS TransactionDetail
WHERE TransactionDetail.AccountId IS NOT NULL
AND TransactionDetail.AcctCategory = @AcctCategory
AND TransactionDetail.AccountType IN (
@AccountType1 ,
@AccountType2 ,
@AccountType3
)
AND MonthsTable.NFirstDayOfMonth <= @MaxDateN
AND MonthsTable.NLastDayOfMonth >= @MinDateN
AND MonthsTable.NLastDayOfMonth >= (
SELECT MIN(NFirstDayOfMonth) AS EarliestTxnDateN
FROM tblTxnDetail AS ValidateAccount
WHERE ValidateAccount.AccountId = TransactionDetail.AccountId
)
GROUP BY
MonthsTable.NFirstDayOfMonth ,
MonthsTable.NLastDayOfMonth ,
MonthsTable.FirstDayOfMonth ,
MonthsTable.LastDayOfMonth ,
TransactionDetail.AccountId ,
TransactionDetail.AcctCategory
)
SELECT MonthsRollupResults.*
INTO #TMonthsRollup
FROM CTE_MonthsRollUp MonthsRollupResults
OPTION (RECOMPILE)
;
CREATE NONCLUSTERED INDEX [#idxTMonthsRollup_AccountIDandTxnCategory_Join]
ON [dbo].[#TMonthsRollup] ([AccountId],[AcctCategory], PeriodStartN)
INCLUDE (PeriodStartD, PeriodEndD)
;
; WITH CTE_AccountList AS (
SELECT DISTINCT DistinctAccountList.NFirstDayOfMonth AS PeriodStartN ,
DistinctAccountList.NLastDayOfMonth AS PeriodEndN ,
DistinctAccountList.AccountId AS AccountId ,
DistinctAccountList.AcctCategory AS AcctCategory ,
DistinctAccountList.AccountType AS AccountType ,
DistinctAccountList.Account AS AccountName ,
DistinctAccountList.ACCOUNTL1 AS AccountHierarchyL1 ,
DistinctAccountList.ACCOUNTL2 AS AccountHierarchyL2 ,
DistinctAccountList.ACCOUNTL3 AS AccountHierarchyL3 ,
SUM(DistinctAccountList.Amount) AS PeriodAmount
FROM tblTxnDetail AS DistinctAccountList
WHERE DistinctAccountList.NFirstDayOfMonth <= @MaxDateN
AND DistinctAccountList.NLastDayOfMonth >= @MinDateN
AND DistinctAccountList.AccountId IS NOT NULL
AND DistinctAccountList.AcctCategory = @AcctCategory
AND DistinctAccountList.AccountType IN (
@AccountType1 ,
@AccountType2 ,
@AccountType3
)
GROUP BY DistinctAccountList.NFirstDayOfMonth ,
DistinctAccountList.NLastDayOfMonth ,
DistinctAccountList.AccountId ,
DistinctAccountList.AcctCategory ,
DistinctAccountList.AccountType ,
DistinctAccountList.Account ,
DistinctAccountList.ACCOUNTL1 ,
DistinctAccountList.ACCOUNTL2 ,
DistinctAccountList.ACCOUNTL3
)
SELECT DistinctAccountList.*
INTO #TAccountList
FROM CTE_AccountList DistinctAccountList
;
CREATE NONCLUSTERED INDEX [#idxTAccountList_DistincAccountDetailsList_For_Join]
ON [dbo].[#TAccountList] ([AccountId],[AcctCategory], PeriodStartN)
--Only the columns that exist in this abbreviated #TAccountList are included
INCLUDE (AccountName, PeriodAmount, AccountType,
AccountHierarchyL1, AccountHierarchyL2, AccountHierarchyL3
)
;
SELECT DISTINCT 'Month' AS Period ,
@MinDateD AS ReportStart ,
@MaxDateD AS ReportEnd ,
tMonthRollup.AccountId AS AccountId ,
tMonthRollup.AcctCategory AS AcctCategory ,
tMonthRollup.PeriodStartD AS PeriodStart ,
tMonthRollup.PeriodEndD AS PeriodEnd ,
AccountList.AccountName AS AccountName ,
ISNULL(AccountList.PeriodAmount, 0.00) AS PeriodAmount , --Gap months have no matching row, so report 0.00 instead of NULL
AccountList.AccountType AS AccountType ,
AccountList.AccountHierarchyL1 AS AccountHierarchyL1 ,
AccountList.AccountHierarchyL2 AS AccountHierarchyL2 ,
AccountList.AccountHierarchyL3 AS AccountHierarchyL3
FROM #TAccountList AS AccountList
RIGHT OUTER JOIN #TMonthsRollup AS tMonthRollup
ON AccountList.AccountId = tMonthRollup.AccountId
AND AccountList.AcctCategory = tMonthRollup.AcctCategory
AND AccountList.PeriodStartN = tMonthRollup.PeriodStartN
--------------------------------------------------------
PRINT IIF(@Debug = @True, 'END PROCEDURE - ' + CAST(SYSDATETIMEOFFSET() AS VARCHAR),NULL)
--------------------------------------------------------
IF @Debug = @True
BEGIN
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
IF @NoCountState <> 0
SET NOCOUNT ON
END
END
Answer 0 (score: 1)
Here is a coded version of how to do the rollup and join it to a calendar table:
WITH cte_Rollup AS
(
    SELECT TxnR.TxnTimePeriod      AS TimePeriod,
           TxnR.TxnPeriodStartDate AS [Start],
           TxnR.TxnPeriodEndDate   AS [End],      --END is a reserved word, so the alias is bracketed
           TxnR.TxnCategory        AS Category,
           TxnR.TxnAccountName     AS Account,
           SUM(TxnR.Amount)        AS Amount
    FROM TxnRollups TxnR
    WHERE
        TxnR.TxnTimePeriod = @SummaryPeriod AND
        TxnR.TxnPeriodEndDate BETWEEN @FromDate AND @ToDate
    GROUP BY
        TxnR.TxnTimePeriod,
        TxnR.TxnPeriodStartDate,
        TxnR.TxnPeriodEndDate,
        TxnR.TxnCategory,
        TxnR.TxnAccountName
), cte_Calendar AS
(
    --Assumes a calendar table with one row per date (PeriodDate) plus Period and MonthValue columns
    SELECT cal.Period, MIN(cal.PeriodDate) AS PeriodStart, MAX(cal.PeriodDate) AS PeriodEnd
    FROM calendar cal
    WHERE cal.PeriodDate BETWEEN @FromDate AND @ToDate AND cal.Period = @SummaryPeriod
    GROUP BY cal.Period, cal.MonthValue
)
SELECT *
FROM cte_Calendar
LEFT OUTER JOIN cte_Rollup ON cte_Calendar.PeriodStart = cte_Rollup.[Start]
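The join above matches on the period start only, so a period shows up as a gap only when no account or category has activity in it. To get a 0.00 row per account and category, as in the question's desired output, one option is to cross join the periods with the distinct account/category pairs before the outer join. The following is a minimal sketch under stated assumptions: the table variables @Periods and @Rollup and their columns are hypothetical stand-ins for the calendar and rollup CTEs, loaded with part of the question's sample data.

```sql
-- Hypothetical stand-in for the aggregated rollup (part of the question's sample rows).
DECLARE @Rollup TABLE (
    PeriodStart DATE,
    Category    VARCHAR(20),
    Account     VARCHAR(20),
    Amount      DECIMAL(16,3)
);
INSERT INTO @Rollup (PeriodStart, Category, Account, Amount) VALUES
    ('2014-01-01', 'CategoryX', 'AccountA', 2421.00),
    ('2014-04-01', 'CategoryX', 'AccountA', 1421.00),
    ('2014-05-01', 'CategoryY', 'AccountA', 9421.00);

-- Hypothetical stand-in for the calendar periods in the reporting window.
DECLARE @Periods TABLE (PeriodStart DATE, PeriodEnd DATE);
INSERT INTO @Periods (PeriodStart, PeriodEnd) VALUES
    ('2014-01-01', '2014-01-31'),
    ('2014-02-01', '2014-02-28'),
    ('2014-03-01', '2014-03-31'),
    ('2014-04-01', '2014-04-30'),
    ('2014-05-01', '2014-05-31');

-- Build every (period, category, account) combination, then attach amounts where they exist.
SELECT p.PeriodStart,
       p.PeriodEnd,
       k.Category,
       k.Account,
       ISNULL(r.Amount, 0.00) AS Amount   -- unmatched combinations become zero
FROM @Periods AS p
CROSS JOIN (SELECT DISTINCT Category, Account FROM @Rollup) AS k
LEFT OUTER JOIN @Rollup AS r
       ON  r.PeriodStart = p.PeriodStart
       AND r.Category    = k.Category
       AND r.Account     = k.Account
ORDER BY k.Account, k.Category, p.PeriodStart;
```

With this data, CategoryX / AccountA gets five rows, with zero amounts for February, March, and May. The cross join produces periods times account/category pairs, so on large data it is worth filtering both sides to the reporting window before combining them.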
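Answer 1 (score: 0)
I would suggest using an indexed months table that contains only two columns: StartDate and EndDate. Then do a simple LEFT OUTER JOIN to the TxnRollups table. A clustered index on the months table (StartDate, EndDate), together with a nonclustered index on TxnRollups over TxnPeriodStart and TxnPeriodEnd, will help speed things up further.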
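A rough sketch of that suggestion follows. It assumes the TxnRollups table from the question's DDL already exists (where the date columns are named TxnPeriodStartDate and TxnPeriodEndDate); the table name MonthPeriods, the index names, and the INCLUDE list are hypothetical choices made for illustration.

```sql
-- Hypothetical months table: one row per calendar month.
CREATE TABLE dbo.MonthPeriods (
    StartDate DATE NOT NULL,
    EndDate   DATE NOT NULL
);

CREATE UNIQUE CLUSTERED INDEX CIX_MonthPeriods
    ON dbo.MonthPeriods (StartDate, EndDate);

-- Supporting index on the rollup table; the INCLUDE list is a guess at what the report reads.
CREATE NONCLUSTERED INDEX IX_TxnRollups_Period
    ON dbo.TxnRollups (TxnPeriodStartDate, TxnPeriodEndDate)
    INCLUDE (TxnTimePeriod, TxnCategory, TxnAccountName, Amount);

DECLARE @FromDate DATE = '2014-01-01';
DECLARE @ToDate   DATE = '2014-05-31';

-- Months with no rollup rows still appear, with a zero total.
SELECT m.StartDate,
       m.EndDate,
       ISNULL(SUM(r.Amount), 0.00) AS Amount
FROM dbo.MonthPeriods AS m
LEFT OUTER JOIN dbo.TxnRollups AS r
       ON  r.TxnPeriodStartDate = m.StartDate
       AND r.TxnPeriodEndDate   = m.EndDate
       AND r.TxnTimePeriod      = 'Month'
WHERE m.StartDate >= @FromDate
  AND m.EndDate   <= @ToDate
GROUP BY m.StartDate, m.EndDate
ORDER BY m.StartDate;
```

The filter on TxnTimePeriod sits in the ON clause rather than the WHERE clause; putting it in WHERE would discard the unmatched months and turn the outer join back into an inner join.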