SQLite - 帮助优化具有多个条件的先前行的总计

时间:2012-12-28 01:37:05

标签: sql sqlite aggregate-functions

我正在尝试获取表中每条记录的Value列的条件SUM,用于按相同“Category”字段值分组的所有“之前”记录,以及相同的“Approved”字段值,然后划分分为负数和正数。

在我的程序中,用户可以按任何顺序创建文档记录,因此“previous”定义为:

如果已批准 = TRUE,则“之前”记录的旧 ApprovedDate 字段值比当前记录更早。如果 ApprovedDate 字段值相同,则“之前”记录的 DocumentNumber 字段值较低。

如果已批准 = FALSE,则“之前”记录的 IssuedDate 字段值比当前记录旧。如果 IssuedDate 字段值相同,则“之前”记录的 DocumentNumber 字段值较低。

例如,在下表中:

CREATE TABLE Changes (GUID TEXT, Value REAL, DocumentNumber TEXT, Approved INTEGER, ApprovedDate TEXT, IssuedDate TEXT, Category TEXT);
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('4F7253A4E1B3D841B84D4A82B4F0E7A2', '11', 0, 18526.7, '', '2009-03-31T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('D97537852E927B499C21C14F3D13CF06', '1', 0, 0, '', '2008-11-10T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('857DADB463807345918729B33399B36F', '2', 0, 0, '', '2008-11-10T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('7989D242E05AFF4FB5EE99114822BF80', '21', 0, 50112, '', '2009-07-22T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('16A0AB27FD3A784D9E0A14406C7683E0', '3', 0, 0, '', '2009-01-15T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('D3D7B1C306D38C438FC3DEDFCB57D411', '131', 0, 17204, '', '2010-12-14T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('2C89D974DDF86743A0D7D62B385FBDEF', '147', 0, 0, '', '2010-12-01T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('F371D4237C837D448824697EB0162905', '198', 0, 0, '', '2011-01-10T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('433D64C871AE4E46A0E1BFCE2BB69BA7', '364', 0, 0, '', '2011-11-14T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('808496DBDE76CB4F911396BB817724F3', '352', 0, 0, '', '2011-10-17T05:00:00Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('9545DEF1666B5F4D8626F19F8E9E9333', '418', 0, 10948, '', '2012-03-07T22:19:18Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('244D7D89B79E0F4E91100E4ADB300656', '439', 0, 50945, '', '2012-04-27T20:33:26Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('115A427BBB1D2C43BA11D9E5875FAA2C', '465', 0, 480049, '', '2012-07-20T16:17:54Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('3A2271EFCC767E4CA40017E68802F10C', '478', 0, 54298, '', '2012-08-01T17:26:38Z', 'UNKNOWN');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('99D0EFC5A9F1AA498DB1A4CDF294129B', '490', 0, 11500, '', '2012-09-18T14:23:13Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('38B2E3A379C5084998E6A84D496AC555', '491', 0, 26088, '', '2012-09-25T06:00:00Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('8902831C8FAD4941841EE2847656BDAF', '494', 0, -825, '', '2012-10-16T14:20:06Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('7AFDB08A002AE54A8DE7699855AEBE30', '495', 0, 221, '', '2012-10-16T14:21:27Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('38A2CCEF5F0B294AA8B8752F461D121D', '496', 0, 0, '', '2012-12-24T01:11:15Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('24CCD5CE409E674593108CBD816DBCCE', '486', 1, -825, '2012-10-01T21:42:52Z', '2012-09-17T20:42:12Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('C7458704E36C8F448C1F3A485EB08304', '485', 1, 10000, '2012-10-01T21:25:56Z', '2012-09-11T21:29:44Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('B511953AE6FB6446A63AA83C159057BE', '487', 1, 82170, '2012-10-01T21:42:51Z', '2012-09-17T20:46:41Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('EC977BC304A971439D04BB9DF4D8188A', '488', 1, 15500, '2012-10-01T20:58:15Z', '2012-09-18T06:00:00Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('D9B1F0C0A8E490448697B783639E09E0', '489', 1, 11503, '2012-10-01T21:42:50Z', '2012-09-18T13:56:18Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('698BB6D65832D146A49727C717A591A1', '492', 1, 2787, '2012-10-01T21:10:06Z', '2012-09-25T15:55:02Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('155D4F2B1854B34FABCDE8CF20F1E44C', '493', 1, 12162, '2012-10-01T21:10:06Z', '2012-09-25T16:04:40Z', 'ALTER');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('137C9BF2B1EFD34B8831ADA70C5F9431', '1', 1, 369543, '2011-12-08T13:41:04Z', '1899-12-30T05:00:00Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('7F29FC7114BD10468AE92A047345B5DB', '2', 1, 7258, '2011-12-08T13:41:04Z', '2011-10-20T05:00:00Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('6B66D8EAD88E6E4FA29401CD524B978A', '3', 1, 979321, '2011-12-08T13:41:04Z', '2011-11-08T05:00:00Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('7F393B712B213041A6DD211E04F6DCA6', '4', 1, 14998, '2012-04-20T15:16:21Z', '2012-04-18T21:07:07Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('2255F84E7C7DA04389765724872D6413', '5', 1, 58926, '2012-04-20T15:16:23Z', '2012-04-18T21:13:15Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('DB4A5588DEB9F34C868F7AD1CB13ACC3', '6', 1, 13232, '2012-04-20T15:16:05Z', '2012-04-18T21:17:00Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('B5231AE40F8E7D41BA0A4D09614CBDF9', '7', 1, 10176, '2012-04-20T15:16:25Z', '2012-04-18T21:19:41Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('2362D54FCC53E447AC7D8289EA89FD05', '8', 1, 17556, '2012-04-20T15:16:04Z', '2012-04-18T21:21:20Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('6ED4565CA041704B8D006EDA4A1E4CF9', '9', 1, 399639, '2012-05-30T16:32:43Z', '2012-05-17T06:00:00Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('B21BE07E3E42C2418C70AD17862D3AE1', '10', 1, 6231, '2012-08-16T16:55:00Z', '2012-08-02T16:02:03Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('8FD252A50137754A98698F93AC9B01A7', '11', 1, 629, '2012-08-16T16:54:58Z', '2012-08-02T16:07:57Z', 'DRAW');
INSERT INTO Changes (GUID, DocumentNumber, Approved, Value, ApprovedDate, IssuedDate, Category) VALUES ('1B9AFD2C20362F48A486E8A535B29AF5', '20', 1, -113810, '2011-12-13T17:15:53Z', '2010-02-10T05:00:00Z', 'UNKNOWN');

结果应该是:

[结果] [2]

|                             GUID |                             SORTID |          VALUE | POSITIVE_PREVIOUS_TOTAL | NEGATIVE_PREVIOUS_TOTAL |
----------------------------------------------------------------------------------------------------------------------------------------------
| 99D0EFC5A9F1AA498DB1A4CDF294129B |   ALTER_0_2012-09-18T14:23:13Z_490 |          11500 |                       0 |                       0 |
| 38B2E3A379C5084998E6A84D496AC555 |   ALTER_0_2012-09-25T06:00:00Z_491 |          26088 |                   11500 |                       0 |
| 8902831C8FAD4941841EE2847656BDAF |   ALTER_0_2012-10-16T14:20:06Z_494 |           -825 |                   37588 |                       0 |
| 7AFDB08A002AE54A8DE7699855AEBE30 |   ALTER_0_2012-10-16T14:21:27Z_495 |            221 |                   37588 |                    -825 |
| 38A2CCEF5F0B294AA8B8752F461D121D |   ALTER_0_2012-12-24T01:11:15Z_496 |              0 |                   37809 |                    -825 |
| EC977BC304A971439D04BB9DF4D8188A |   ALTER_1_2012-10-01T20:58:15Z_488 |          15500 |                   92170 |                    -825 |
| 698BB6D65832D146A49727C717A591A1 |   ALTER_1_2012-10-01T21:10:06Z_492 |           2787 |                  119173 |                    -825 |
| 155D4F2B1854B34FABCDE8CF20F1E44C |   ALTER_1_2012-10-01T21:10:06Z_493 |          12162 |                  121960 |                    -825 |
| C7458704E36C8F448C1F3A485EB08304 |   ALTER_1_2012-10-01T21:25:56Z_485 |          10000 |                       0 |                       0 |
| D9B1F0C0A8E490448697B783639E09E0 |   ALTER_1_2012-10-01T21:42:50Z_489 |          11503 |                  107670 |                    -825 |
| B511953AE6FB6446A63AA83C159057BE |   ALTER_1_2012-10-01T21:42:51Z_487 |          82170 |                   10000 |                    -825 |
| 24CCD5CE409E674593108CBD816DBCCE |   ALTER_1_2012-10-01T21:42:52Z_486 |           -825 |                   10000 |                       0 |
| 137C9BF2B1EFD34B8831ADA70C5F9431 |      DRAW_1_2011-12-08T13:41:04Z_1 |         369543 |                       0 |                       0 |
| 7F29FC7114BD10468AE92A047345B5DB |      DRAW_1_2011-12-08T13:41:04Z_2 |           7258 |                  369543 |                       0 |
| 6B66D8EAD88E6E4FA29401CD524B978A |      DRAW_1_2011-12-08T13:41:04Z_3 |         979321 |                  376801 |                       0 |
| 2362D54FCC53E447AC7D8289EA89FD05 |      DRAW_1_2012-04-20T15:16:04Z_8 |          17556 |                 1453454 |                       0 |
| DB4A5588DEB9F34C868F7AD1CB13ACC3 |      DRAW_1_2012-04-20T15:16:05Z_6 |          13232 |                 1430046 |                       0 |
| 7F393B712B213041A6DD211E04F6DCA6 |      DRAW_1_2012-04-20T15:16:21Z_4 |          14998 |                 1356122 |                       0 |
| 2255F84E7C7DA04389765724872D6413 |      DRAW_1_2012-04-20T15:16:23Z_5 |          58926 |                 1371120 |                       0 |
| B5231AE40F8E7D41BA0A4D09614CBDF9 |      DRAW_1_2012-04-20T15:16:25Z_7 |          10176 |                 1443278 |                       0 |
| 6ED4565CA041704B8D006EDA4A1E4CF9 |      DRAW_1_2012-05-30T16:32:43Z_9 |         399639 |                 1471010 |                       0 |
| 8FD252A50137754A98698F93AC9B01A7 |     DRAW_1_2012-08-16T16:54:58Z_11 |            629 |                 1876880 |                       0 |
| B21BE07E3E42C2418C70AD17862D3AE1 |     DRAW_1_2012-08-16T16:55:00Z_10 |           6231 |                 1870649 |                       0 |
| D97537852E927B499C21C14F3D13CF06 |   UNKNOWN_0_2008-11-10T05:00:00Z_1 |              0 |                       0 |                       0 |
| 857DADB463807345918729B33399B36F |   UNKNOWN_0_2008-11-10T05:00:00Z_2 |              0 |                       0 |                       0 |
| 16A0AB27FD3A784D9E0A14406C7683E0 |   UNKNOWN_0_2009-01-15T05:00:00Z_3 |              0 |                       0 |                       0 |
| 4F7253A4E1B3D841B84D4A82B4F0E7A2 |  UNKNOWN_0_2009-03-31T05:00:00Z_11 | 18526.69921875 |                       0 |                       0 |
| 7989D242E05AFF4FB5EE99114822BF80 |  UNKNOWN_0_2009-07-22T05:00:00Z_21 |          50112 |          18526.69921875 |                       0 |
| 2C89D974DDF86743A0D7D62B385FBDEF | UNKNOWN_0_2010-12-01T05:00:00Z_147 |              0 |          68638.69921875 |                       0 |
| D3D7B1C306D38C438FC3DEDFCB57D411 | UNKNOWN_0_2010-12-14T05:00:00Z_131 |          17204 |          68638.69921875 |                       0 |
| F371D4237C837D448824697EB0162905 | UNKNOWN_0_2011-01-10T05:00:00Z_198 |              0 |          85842.69921875 |                       0 |
| 808496DBDE76CB4F911396BB817724F3 | UNKNOWN_0_2011-10-17T05:00:00Z_352 |              0 |          85842.69921875 |                       0 |
| 433D64C871AE4E46A0E1BFCE2BB69BA7 | UNKNOWN_0_2011-11-14T05:00:00Z_364 |              0 |          85842.69921875 |                       0 |
| 9545DEF1666B5F4D8626F19F8E9E9333 | UNKNOWN_0_2012-03-07T22:19:18Z_418 |          10948 |          85842.69921875 |                       0 |
| 244D7D89B79E0F4E91100E4ADB300656 | UNKNOWN_0_2012-04-27T20:33:26Z_439 |          50945 |          96790.69921875 |                       0 |
| 115A427BBB1D2C43BA11D9E5875FAA2C | UNKNOWN_0_2012-07-20T16:17:54Z_465 |         480049 |         147735.69921875 |                       0 |
| 3A2271EFCC767E4CA40017E68802F10C | UNKNOWN_0_2012-08-01T17:26:38Z_478 |          54298 |         627784.69921875 |                       0 |
| 1B9AFD2C20362F48A486E8A535B29AF5 |  UNKNOWN_1_2011-12-13T17:15:53Z_20 |        -113810 |                       0 |                       0 |

基本上 - 所有记录按类别和已批准字段值分组,然后按批准日期,文档编号(如果批准= TRUE)或IssuedDate,DocumentNumber(如果批准= FALSE)排序。最后,逐个记录的运行总和将计算具有相同类别和批准值的每个先前记录的值,并在单独的列中报告负数和正数。

我可以通过子选择获得所需的结果,但性能很差(700-ish记录需要6.5秒)。我一直在尝试使用CASE WHEN语句结合各种GROUP BY和HAVING子句,但我似乎无法获得正确的值(我无法获得逻辑来确定“以前的”记录以满足我的要求) 。但是,使用这种方法可以获得非常好的性能(在大多数情况下,对于相同的数据集,性能不到200毫秒)。

这是我的尝试(仅限SQLite,下面是T-SQL),但需要很长时间:

SELECT a.GUID, 
SUM(CASE WHEN b.Value>0 THEN b.Value ELSE 0 END) as positive_previous_total, 
SUM(CASE WHEN b.Value<0 THEN b.Value ELSE 0 END)  as negative_previous_total 
FROM Changes AS a left join Changes as b 
ON b.rowid != a.rowid 
AND b.Approved =a.Approved 
AND b.Category=a.Category 
AND ((IFNULL(SUBSTR(CASE WHEN b.Approved THEN b.ApprovedDate ELSE b.IssuedDate END, 1, 10), '0000-00-00')<IFNULL(SUBSTR(CASE WHEN a.Approved THEN a.ApprovedDate ELSE a.IssuedDate END, 1, 10), '0000-00-00') 
OR ((IFNULL(SUBSTR(CASE WHEN b.Approved THEN b.ApprovedDate ELSE b.IssuedDate END, 1, 10), '0000-00-00')=IFNULL(SUBSTR(CASE WHEN a.Approved THEN a.ApprovedDate ELSE a.IssuedDate END, 1, 10), '0000-00-00') AND b.DocumentNumber<a.DocumentNumber)))) 
GROUP BY a.rowid

这是我的尝试(T-SQL)

   SELECT 
    a.[GUID], 
    [positive_previous_total] = SUM(CASE WHEN b.Value>0 THEN b.Value ELSE 0 END), 
    [negative_previous_total] = SUM(CASE WHEN b.Value<0 THEN b.Value ELSE 0 END) 
FROM 
    #Changes a 
    LEFT OUTER JOIN #Changes b 
        ON 
        b.[GUID]    <> a.[GUID] AND
        b.Approved  = a.Approved AND
        b.Category  = a.Category 
        AND 
        (
        ISNULL(SUBSTRING(CASE WHEN b.Approved=1 THEN b.ApprovedDate ELSE b.IssuedDate END, 1, 10), '0000-00-00') 
          < ISNULL(SUBSTRING(CASE WHEN a.Approved=1 THEN a.ApprovedDate ELSE a.IssuedDate END, 1, 10), '0000-00-00')
        OR 
            (
            ISNULL(SUBSTRING(CASE WHEN b.Approved=1 THEN b.ApprovedDate ELSE b.IssuedDate END, 1, 10), '0000-00-00')
              =ISNULL(SUBSTRING(CASE WHEN a.Approved=1 THEN a.ApprovedDate ELSE a.IssuedDate END, 1, 10), '0000-00-00') 
            AND 
            b.DocumentNumber<a.DocumentNumber
            )
        ) 
GROUP BY a.[GUID]

有没有人建议我更理想地获得所需的结果?

提前感谢您的帮助。

2 个答案:

答案 0 :(得分:2)

首先 - 感谢为什么thetheq和以下主题中的贡献者:Can this be simplified and therefore optimized值得一读,即使解决方案由于其局限性而不能全部适用于SQLite。我能够根据将SELECT时间缩短一半的建议实现一些优化。

如何完成:

  1. 已添加且有效日期列,其在Approved = TRUE时存储ApprovedDate的日期部分,或在Approved时为IssuedDate = FALSE为整数而不是ISO8601文本。

  2. 添加了一个名为EffectiveID的新列,它将Category,Approved,零填充的EffectiveDate和零填充的DocumentNumber存储为单个值,以便更快地进行比较。

  3. 更新Approved,IssuedDate和ApprovedDate列时创建更新触发器以更新EffectiveDate列。

  4. 在EFfectiveDate,Category,Approved和Number列上创建了一个UPDATE TRIGGER,以相应地更新EffectiveID字段。

  5. 在插入新行时创建了一个TRIGGER,以相应地更新EffectiveDate。

  6. 为EffectiveDate和EffectiveID列添加了索引

  7. 重新安排ON子句测试以减少匹配次数,减少处理

  8. 更改了JOIN以仅选择结果集中所需的列子集。

  9. 最终的SQL如下:

    SELECT 
    a.GUID, 
    TOTAL(CASE WHEN b.Value>0 THEN b.Value ELSE 0 END) as positive_previous_total, 
    TOTAL(CASE WHEN b.Value<0 THEN b.Value ELSE 0 END) as negative_previous_total 
    FROM 
    Changes AS a 
    LEFT JOIN 
    (SELECT rowid, Value, Category, Approved, EffectiveId FROM Changes) as b 
    ON 
    b.EffectiveID < a.EffectiveID 
    AND b.Category=a.Category
    AND b.Approved=a.Approved
    AND b.rowid != a.rowid
    GROUP BY 
    a.rowid
    

答案 1 :(得分:1)

请参阅建议HERE ON SO了解如何加快速度。

两位或三位SO重量级人物都做出了贡献,因此我会与您的DBA联系,看看可以做出哪些更改。

似乎共识是你需要在changes表上添加一些索引。

我也喜欢在这个表中添加几个计算字段,然后使后续脚本更具可读性。

如果您能够更改某些数据类型,那么答案中会有很多建议。