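Multiple running totals with GROUP BY

Time: 2012-04-28 23:02:17

Tags: sql sql-server

I'm struggling to find a good way to compute running totals with one or more grouping levels. The cursor-based running total below works on a whole table, but I'd like to extend it to add a "client" dimension, so that a single table holds running totals for each company (Company A, Company B, Company C, and so on), as shown further down.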

CREATE TABLE test (tag int,  Checks float, AVG_COST float, Check_total float,  Check_amount float, Amount_total float, RunningTotal_Check float,  
 RunningTotal_Amount float)

DECLARE @tag int,
        @Checks float,
        @AVG_COST float,
        @check_total float,
        @Check_amount float,
        @amount_total float,
        @RunningTotal_Check float ,
        @RunningTotal_Check_PCT float,
        @RunningTotal_Amount float



SET @RunningTotal_Check = 0
SET @RunningTotal_Check_PCT = 0
SET @RunningTotal_Amount = 0
DECLARE aa_cursor CURSOR fast_forward
FOR
SELECT tag, Checks, AVG_COST, check_total, check_amount, amount_total
FROM test_3
ORDER BY tag  -- a running total needs a defined row order; tag is the order the final SELECT uses

OPEN aa_cursor
FETCH NEXT FROM aa_cursor INTO @tag,  @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total
WHILE @@FETCH_STATUS = 0
 BEGIN
  SET @RunningTotal_CHeck = @RunningTotal_CHeck + @checks
  set @RunningTotal_Amount = @RunningTotal_Amount + @Check_amount
  INSERT test VALUES (@tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total,  @RunningTotal_check, @RunningTotal_Amount )
  FETCH NEXT FROM aa_cursor INTO @tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total
 END

CLOSE aa_cursor
DEALLOCATE aa_cursor

SELECT *, RunningTotal_Check/Check_total as CHECK_RUN_PCT, round((RunningTotal_Check/Check_total *100),0) as CHECK_PCT_BIN,  RunningTotal_Amount/Amount_total as Amount_RUN_PCT,  round((RunningTotal_Amount/Amount_total * 100),0) as Amount_PCT_BIN
into test_4
FROM test ORDER BY tag
create clustered index IX_TESTsdsdds3 on test_4(tag)

DROP TABLE test

----------------------------------

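I can produce the running total for any single company, but I'd like to do this for multiple companies at once, producing something like the result below.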

CLIENT  COUNT   Running Total
Company A   1   6.7%
Company A   2   20.0%
Company A   3   40.0%
Company A   4   66.7%
Company A   5   100.0%
Company B   1   3.6%
Company B   2   10.7%
Company B   3   21.4%
Company B   4   35.7%
Company B   5   53.6%
Company B   6   75.0%
Company B   7   100.0%
Company C   1   3.6%
Company C   2   10.7%
Company C   3   21.4%
Company C   4   35.7%
Company C   5   53.6%
Company C   6   75.0%
Company C   7   100.0%

3 answers:

Answer 0 (score: 5)

This is easiest in SQL Server 2012, where SUM and COUNT support an OVER clause that includes ORDER BY. Using Cris's #Checks table definition:

SELECT
  CompanyID,
  count(*) over (
    partition by CompanyID
    order by Cleared, ID
  ) as cnt,
  str(100.0*sum(Amount) over (
    partition by CompanyID
    order by Cleared, ID
  )/
  sum(Amount) over (
    partition by CompanyID
  ),5,1)+'%' as RunningTotalForThisCompany
FROM #Checks;

SQL Fiddle here
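For readers without a SQL Server 2012 instance at hand, the same windowed aggregates can be tried out as a rough sketch in SQLite through Python's sqlite3 module. This is a stand-in under stated assumptions, not the answer's T-SQL: it assumes SQLite 3.25 or later (needed for window functions), uses ROUND instead of str(), stores dates as ISO text, and loads a regular table named Checks with the four sample rows from Cris's answer.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server's #Checks temp table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Checks ("
    " ID INTEGER PRIMARY KEY, CompanyID INT NOT NULL,"
    " Amount REAL NOT NULL, Cleared TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Checks (CompanyID, Amount, Cleared) VALUES (?, ?, ?)",
    [
        (1, 5, "2012-04-01"),
        (1, 5, "2012-04-02"),
        (1, 7, "2012-04-05"),
        (2, 10, "2012-04-03"),
    ],
)

# Running count and running percentage per company, restarted for each CompanyID.
query = """
SELECT
  CompanyID,
  COUNT(*) OVER (PARTITION BY CompanyID ORDER BY Cleared, ID) AS cnt,
  ROUND(
    100.0 * SUM(Amount) OVER (PARTITION BY CompanyID ORDER BY Cleared, ID)
          / SUM(Amount) OVER (PARTITION BY CompanyID),
    1
  ) AS pct
FROM Checks
ORDER BY CompanyID, cnt;
"""
rows = conn.execute(query).fetchall()

for company_id, cnt, pct in rows:
    print(company_id, cnt, pct)
```

With this sample data, company 1 reaches 29.4, 58.8 and 100.0 percent over its three checks, and company 2 reaches 100.0 percent on its only check. ID is included in the ORDER BY as a tie-breaker so that two checks clearing on the same date are still accumulated one at a time.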

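Answer 1 (score: 5)

I originally started to post the SQL Server 2012 equivalent (since you didn't mention which version you're using). Steve has shown how simple this calculation is in the newest version of SQL Server, so I'll focus on a few methods that work on earlier versions of SQL Server (back to 2005).

I'm going to take some liberties with your schema, since I can't figure out what all these #test, #test_3 and #test_4 temp tables are supposed to represent. How about: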

USE tempdb;
GO

CREATE TABLE dbo.Checks
(
  Client VARCHAR(32),
  CheckDate DATETIME,
  Amount DECIMAL(12,2)
);

INSERT dbo.Checks(Client, CheckDate, Amount)
          SELECT 'Company A', '20120101', 50
UNION ALL SELECT 'Company A', '20120102', 75
UNION ALL SELECT 'Company A', '20120103', 120
UNION ALL SELECT 'Company A', '20120104', 40
UNION ALL SELECT 'Company B', '20120101', 75
UNION ALL SELECT 'Company B', '20120105', 200
UNION ALL SELECT 'Company B', '20120107', 90;

Expected output in this case:

Client    Count Running Total
--------- ----- -------------
Company A 1     17.54
Company A 2     43.86
Company A 3     85.96
Company A 4     100.00
Company B 1     20.55
Company B 2     75.34
Company B 3     100.00
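The percentages above can be checked by hand: Company A's checks total 285 and Company B's total 365, and each row is the cumulative amount divided by that total. A minimal Python sketch of the arithmetic follows; the dictionary layout and the function name running_percentages are illustrative only, and half-up rounding to two decimals is an assumption that happens to reproduce the figures shown.

```python
from decimal import Decimal, ROUND_HALF_UP
from itertools import accumulate

# Sample amounts per client, already in CheckDate order (taken from the INSERT above).
checks = {
    "Company A": [50, 75, 120, 40],
    "Company B": [75, 200, 90],
}


def running_percentages(amounts):
    """Return each cumulative sum as a percentage of the client's grand total."""
    total = sum(amounts)
    return [
        (Decimal(100 * partial) / Decimal(total)).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        for partial in accumulate(amounts)
    ]


result = {client: running_percentages(amounts) for client, amounts in checks.items()}

for client, percentages in result.items():
    for count, pct in enumerate(percentages, start=1):
        print(client, count, pct)
```

For example, Company A's second row is (50 + 75) / 285 = 43.86 percent, and the last row of every client is always 100.00.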

One way:

;WITH gt(Client, Totals) AS 
(
  SELECT Client, SUM(Amount) 
    FROM dbo.Checks AS c
    GROUP BY Client
), n (Client, Amount, rn) AS
(
  SELECT c.Client, c.Amount, 
    ROW_NUMBER() OVER  (PARTITION BY c.Client ORDER BY c.CheckDate)
    FROM dbo.Checks AS c
)
SELECT n.Client, [Count] = n.rn, 
  [Running Total] = CONVERT(DECIMAL(5,2), 100.0*(
    SELECT SUM(Amount) FROM n AS n2 
    WHERE Client = n.Client AND rn <= n.rn)/gt.Totals
 )
 FROM n INNER JOIN gt ON n.Client = gt.Client
 ORDER BY n.Client, n.rn;

A slightly faster alternative, with more reads but shorter duration and a simpler plan:

;WITH x(Client, CheckDate, rn, rt, gt) AS 
(
   SELECT Client, CheckDate, rn = ROW_NUMBER() OVER
   (PARTITION BY Client ORDER BY CheckDate),
    (SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client 
      AND CheckDate <= c.CheckDate),
    (SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client)
FROM dbo.Checks AS c
)
SELECT Client, [Count] = rn, 
  [Running Total] = CONVERT(DECIMAL(5,2), rt * 100.0/gt)
  FROM x
  ORDER BY Client, [Count];

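While I've offered set-based alternatives here, in my experience cursors are often the fastest supported way to perform running totals. There are other methods, such as the quirky update, which perform marginally faster, but the results aren't guaranteed. The set-based approaches that perform a self-join become more and more expensive as the number of source rows grows, so what seems to perform well when testing against a small table degrades as the table gets larger.

I have a blog post almost fully prepared that goes through a slightly simpler performance comparison of various running totals approaches. It is simpler because it is not grouped and it only shows the totals, not the running total percentage. I hope to publish this post soon and will try to remember to update this space.

There is another alternative to consider that doesn't require reading previous rows multiple times. It's a concept Hugo Kornelis describes as "set-based iteration." I don't recall where I first learned this technique, but it makes a lot of sense in some scenarios.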

DECLARE @c TABLE
(
 Client VARCHAR(32), 
 CheckDate DATETIME,
 Amount DECIMAL(12,2),
 rn INT,
 rt DECIMAL(15,2)
);

INSERT @c SELECT Client, CheckDate, Amount,
  ROW_NUMBER() OVER (PARTITION BY Client
 ORDER BY CheckDate), 0
 FROM dbo.Checks;

DECLARE @i INT, @m INT;
SELECT @i = 2, @m = MAX(rn) FROM @c;

UPDATE @c SET rt = Amount WHERE rn = 1;

WHILE @i <= @m
BEGIN
    UPDATE c SET c.rt = c2.rt + c.Amount
      FROM @c AS c
      INNER JOIN @c AS c2
      ON c.rn = c2.rn + 1
      AND c.Client = c2.Client
      WHERE c.rn = @i;

    SET @i = @i + 1;
END

SELECT Client, [Count] = rn, [Running Total] = CONVERT(
  DECIMAL(5,2), rt*100.0 / (SELECT TOP 1 rt FROM @c
 WHERE Client = c.Client ORDER BY rn DESC)) FROM @c AS c;

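While this does perform a loop, and everyone tells you that loops and cursors are bad, one gain with this method is that once the previous row's running total has been calculated, we only have to look at the previous row instead of summing all prior rows. The other gain is that in most cursor-based solutions you have to go through each client and then each check. In this case, you go through all clients' 1st checks once, then all clients' 2nd checks once. So instead of (client count * avg check count) iterations, we only do (max check count) iterations. This solution doesn't make much sense for the simple running totals example, but for the grouped running totals example it should be tested against the set-based solutions above. It's not going to beat Steve's approach, though, if you are on SQL Server 2012.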
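To make the iteration-count argument concrete, here is a small Python simulation of the loop above, run against the same seven sample checks. It is only a sketch of the idea, not a substitute for testing in SQL Server: the list of dictionaries stands in for the @c table variable, and the names set_based_iteration, passes and cursor_fetches are made up for illustration.

```python
from collections import defaultdict

# (client, amount) pairs in CheckDate order within each client, as in dbo.Checks.
checks = [
    ("Company A", 50), ("Company A", 75), ("Company A", 120), ("Company A", 40),
    ("Company B", 75), ("Company B", 200), ("Company B", 90),
]


def set_based_iteration(checks):
    """Simulate the WHILE loop: one pass per row number, all clients at once."""
    table = []
    counters = defaultdict(int)
    for client, amount in checks:
        counters[client] += 1  # plays the role of ROW_NUMBER() per client
        table.append({"client": client, "rn": counters[client],
                      "amount": amount, "rt": 0})

    # Seed step: UPDATE ... SET rt = Amount WHERE rn = 1
    for row in table:
        if row["rn"] == 1:
            row["rt"] = row["amount"]

    max_rn = max(row["rn"] for row in table)
    passes = 0
    for i in range(2, max_rn + 1):
        # Each pass looks only at the previous row (rn = i - 1) of the same client.
        previous = {r["client"]: r["rt"] for r in table if r["rn"] == i - 1}
        for row in table:
            if row["rn"] == i:
                row["rt"] = previous[row["client"]] + row["amount"]
        passes += 1
    return table, passes


table, passes = set_based_iteration(checks)
cursor_fetches = len(checks)  # a row-by-row cursor would fetch every check once

print("loop passes:", passes, "cursor fetches:", cursor_fetches)
for row in table:
    print(row["client"], row["rn"], row["rt"])
```

On this data the loop makes 3 passes after the seeding update, because the largest client has 4 checks, whereas a row-by-row cursor would fetch all 7 rows. The gap widens as the number of clients grows while the checks per client stay modest.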

Update

I have blogged about various running totals approaches here:

http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals

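Answer 2 (score: 0)

I didn't exactly understand the schema you were pulling from, but here is a quick query using a temp table that shows how to do a running total in a set-based operation.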

CREATE TABLE #Checks
(
     ID int IDENTITY(1,1) PRIMARY KEY
    ,CompanyID int NOT NULL
    ,Amount float NOT NULL
    ,Cleared datetime NOT NULL
)

INSERT INTO #Checks
VALUES
     (1,5,'4/1/12')
    ,(1,5,'4/2/12')
    ,(1,7,'4/5/12')
    ,(2,10,'4/3/12')

SELECT Info.ID, Info.CompanyID, Info.Amount, RunningTotal.Total, Info.Cleared
FROM
(
SELECT main.ID, SUM(other.Amount) as Total
FROM
    #Checks main
JOIN
    #Checks other
ON
    main.CompanyID = other.CompanyID
AND
    main.Cleared >= other.Cleared
GROUP BY
    main.ID) RunningTotal
JOIN
    #Checks Info
ON
    RunningTotal.ID = Info.ID

DROP TABLE #Checks