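Merging two tables with some business logic for overfill

Time: 2018-11-01 15:26:22

Tags: sql sql-server tsql

I need to write a report and have run into a challenge.

I have a list of selected results that looks like this: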

+---------+----------+----------+
| Header  | estimate | TargetId |
+---------+----------+----------+
| Task 1  |       80 |        1 |
| Task 2  |       30 |        1 |
| Task 3  |       40 |        2 |
| Task 4  |       10 |        2 |
+---------+----------+----------+

I want to join it to another set of data that contains the target information:

+--------+----------+
| Target | Capacity |
+--------+----------+
|      1 |      100 |
|      2 |       50 |
|      3 |       50 |
+--------+----------+

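However, I want to do some kind of pivot/cross join that fills each target up to its capacity, and report the result in a way that forecasts when each task for a target can be completed.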

+---------+----------+----------+----------+----------+---+---+
| Header  | Overfill | Target 1 | Target 2 | Target 3 | … | … |
+---------+----------+----------+----------+----------+---+---+
| Task 1  | No       |       80 |        0 |        0 | 0 | 0 |
| Task 2  | Yes      |       20 |       10 |        0 | 0 | 0 |
| Task 3  | No       |        0 |       40 |        0 | 0 | 0 |
| Task 4  | Yes      |        0 |        0 |       10 | 0 | 0 |
+---------+----------+----------+----------+----------+---+---+

Alternative display:

+---------+--------+-----------+
| Header  | Target | Overfill% |
+---------+--------+-----------+
| Task 1  | 1      | 0         |
| Task 2  | 1,2    | 33.33     |
| Task 3  | 2      | 0         |
| Task 4  | 3      | 100       |
+---------+--------+-----------+

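The actual dataset will involve a few hundred tasks spread across 20 to 30 targets. Unfortunately, apart from a few simple selects, I have no code to show, because I am not sure how to approach the overflow.

I believe this could be done easily in C#, but I was hoping to do it as a pure SP operation so that I can return the data in the shape I want to display.

Any help or a nudge in the right direction would be greatly appreciated. Chris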

1 Answer:

Answer 0: (score: 2)

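Doing this in SQL is not a great idea, but it is possible with a recursive CTE. The solution below uses a recursive CTE whose result set holds the state of the solution. On each recursive iteration it reads one record from each source and updates the state with the results of a few calculations. Depending on the state in the result, it advances the sequence, the target, or both.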
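
To make the "state in the result set" idea concrete, here is a minimal sketch of a recursive CTE that carries a running total from row to row. The `@Steps` table, its values, and the `State` CTE are made up for illustration and are not part of the question's data.

```sql
-- Hypothetical sample data, not taken from the question.
DECLARE @Steps TABLE ( StepId INT, Amount INT );
INSERT INTO @Steps VALUES ( 1, 80 ), ( 2, 30 ), ( 3, 40 );

WITH State AS (
    -- Anchor row: the initial state before any step has been processed.
    SELECT 0 AS StepId, 0 AS RunningTotal

    UNION ALL

    -- Recursive member: read the previous state, join to the next step,
    -- and emit the updated state as a new row.
    SELECT s.StepId, st.RunningTotal + s.Amount
    FROM State st
    INNER JOIN @Steps s ON s.StepId = st.StepId + 1
)
SELECT StepId, RunningTotal
FROM State
WHERE StepId != 0 -- hide the seed row
ORDER BY StepId;
```

Each iteration sees only the rows produced by the previous one, so whatever the next step needs (here just the running total) has to be carried in the columns of the state row.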

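This solution assumes that the targets and the headers are numbered sequentially. If the targets are not sequential, you can use a CTE to add a ROW_NUMBER() to them. Also, if the solution needs more than 32767 steps it will fail, because that is the largest value the MAXRECURSION hint used below accepts (MAXRECURSION 0 removes the limit altogether, at the risk of a runaway loop). The number of steps should be at most the number of tasks plus the number of targets.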
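
As a rough sketch of the ROW_NUMBER() idea, assuming targets whose ids have gaps (the ids below and the `SequencedTargets` / `TargetSeq` names are hypothetical):

```sql
-- Hypothetical targets whose ids are not consecutive.
DECLARE @Targets TABLE ( TargetId INT, Capacity INT );
INSERT INTO @Targets VALUES ( 10, 100 ), ( 25, 50 ), ( 40, 50 );

WITH SequencedTargets AS (
    -- TargetSeq is a gap-free 1, 2, 3, ... numbering of the real ids.
    -- ROW_NUMBER() returns BIGINT, so cast it to INT to keep the column
    -- types aligned with the INT columns used in a recursive CTE.
    SELECT CAST(ROW_NUMBER() OVER (ORDER BY TargetId) AS INT) AS TargetSeq
        , TargetId
        , Capacity
    FROM @Targets
)
SELECT TargetSeq, TargetId, Capacity
FROM SequencedTargets
ORDER BY TargetSeq;
```

The recursive query would then step through `TargetSeq` instead of `TargetId`, and each task's `TargetId` would need to be mapped to the matching `TargetSeq` as well.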

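One nice thing is that it handles overflow across multiple targets. For example, if a task has an estimate large enough to fill several targets, the next task starts at the next available bucket rather than at its assigned one. Go ahead and put some crazy numbers in there.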

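Finally, I don't know how you derive the overfill percentage, nor how you got the result in the last row of your sample data. I doubt the answer will be hard to derive once the criteria are known.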

/** Setup Test Data **/
DECLARE @Tasks TABLE ( Header VARCHAR(20), Estimate INT, TargetId INT );
DECLARE @Targets TABLE ( TargetId INT, Capacity INT );

INSERT INTO @Tasks VALUES 
( 'Task 1', 80, 1 ), ( 'Task 2', 30, 1 ), ( 'Task 3', 40, 2 ), ( 'Task 4', 10, 2 );

INSERT INTO @Targets VALUES ( 1, 100 ), ( 2, 50 ), ( 3, 50 );

/** Solution **/

WITH Sequenced AS (
    -- Added SequenceId for tasks as it feels janky to order by headers.
    SELECT CAST(ROW_NUMBER() OVER (ORDER BY Header) AS INT) [SequenceId], tsk.*
    FROM @Tasks tsk
)
, TargetsWithOverflow AS (
    SELECT *
    FROM @Targets
    UNION
    SELECT MAX(TargetId) + 1, 99999999 -- overflow target to store excess not handled by targets
    FROM @Targets
)
, src AS (
    -- initialize state
    SELECT 0 [SequenceId], CAST('' AS varchar(20)) [Header], 0 [Estimate], 0 [CurrentTargetId]
        , 0 [CurrentTargetFillLevel], 0 [SequenceRemainingEstimate], 0 [OverfillAmt]

    UNION ALL

    SELECT seq.SequenceId, seq.Header, seq.Estimate, tgt.TargetId
        , CASE WHEN [Excess] <= 0 THEN TrueFillLevel + TrueEstimate -- capacity meets estimate
            ELSE tgt.Capacity -- there is excess estimate
        END 
        , CASE WHEN [Excess] <= 0 THEN 0 -- task complete
            ELSE [Excess] -- task is not complete still some of estimate is left
        END
        , CASE WHEN tgt.TargetId != seq.TargetId THEN  
            CASE WHEN [Excess] > 0 THEN [TrueEstimate] - [Excess] ELSE [TrueEstimate] END
            ELSE 0 
        END
    FROM src
    INNER JOIN Sequenced seq ON 
        (src.SequenceRemainingEstimate = 0 AND seq.SequenceId = src.SequenceId + 1)
        OR (src.SequenceRemainingEstimate > 0 AND seq.SequenceId = src.SequenceId)
    INNER JOIN TargetsWithOverflow tgt ON 
        -- Part of target selection is based on if the sequence advanced.
        -- If the sequence has advanced then get the target assigned to the sequence 
        -- Or use the current one if it is GTE to the assigned target.
        -- Otherwise get the target after current target.
        (tgt.TargetId = seq.TargetId AND tgt.TargetId > src.CurrentTargetId AND seq.SequenceId != src.SequenceId)
        OR (tgt.TargetId = src.CurrentTargetId AND src.CurrentTargetId >= seq.TargetId AND tgt.Capacity >= src.CurrentTargetFillLevel AND seq.SequenceId != src.SequenceId)
        OR (tgt.TargetId = src.CurrentTargetId + 1 AND seq.SequenceId = src.SequenceId)
    CROSS APPLY (
        SELECT CASE WHEN tgt.TargetId != src.CurrentTargetId THEN 0 ELSE src.CurrentTargetFillLevel END [TrueFillLevel] 
    ) forFillLevel
    CROSS APPLY (
        SELECT tgt.Capacity - [TrueFillLevel] [TrueCapacity]
    ) forCapacity
    CROSS APPLY (
        SELECT CASE WHEN src.SequenceRemainingEstimate > 0 THEN src.SequenceRemainingEstimate ELSE seq.Estimate END [TrueEstimate]
    ) forEstimate
    CROSS APPLY (
        SELECT TrueEstimate - TrueCapacity [Excess]
    ) forExcess
)
SELECT src.Header
    , LEFT(STUFF((SELECT ',' + RTRIM(srcIn.CurrentTargetId)
            FROM src srcIn
            WHERE srcIn.Header = src.Header
            ORDER BY srcIn.CurrentTargetId
            FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, ''), 500)
    [Target] 
    , CASE WHEN SUM(OverfillAmt) > 0 THEN 'Yes' ELSE 'No' END [Overfill]
    , SUM (OverfillAmt) / (1.0 * AVG(seq.Estimate)) [OverfillPct]
FROM src
INNER JOIN Sequenced seq ON seq.SequenceId = src.SequenceId
WHERE src.SequenceId != 0
GROUP BY src.Header
OPTION (MAXRECURSION 32767)

Output

Header               Target     Overfill OverfillPct
-------------------- ---------- -------- ----------------
Task 1               1          No       0.00000000000000
Task 2               1,2        Yes      0.33333333333333
Task 3               2          No       0.00000000000000
Task 4               2,3        Yes      1.00000000000000
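
For reference, the OverfillPct column in the query above divides the part of a task that landed outside its assigned target by the task's estimate. Here is a small sketch of that arithmetic, using amounts I traced by hand from the sample data: 10 of Task 2's 30 units spill into target 2, and all 10 of Task 4's units land in target 3. The derived table `t` and its inline values are for illustration only.

```sql
SELECT t.Header
    , t.OverfillAmt / (1.0 * t.Estimate) AS OverfillPct -- 1.0 * avoids integer division
FROM ( VALUES ( 'Task 2', 10, 30 ), ( 'Task 4', 10, 10 ) ) AS t ( Header, OverfillAmt, Estimate );
```

If you define the percentage differently (for example, relative to the target's capacity), only this one expression should need to change.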

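I just re-read your question and realized that you intend to run this query inside a stored procedure. If that is the case, you could take the techniques from this approach and adapt them into a solution that uses cursors. I hate cursors, but I doubt it would perform any worse than this solution, and it would not have the recursion limit. You would just store the results in a temp table or table variable and return the stored procedure's result from that.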
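
A rough, untested sketch of what such a cursor version might look like, under some assumptions of mine: tasks are processed in Header order, target ids are consecutive, and anything past the last target goes into an unlimited overflow bucket. The `@Allocations` table and all variable names are hypothetical.

```sql
-- Same test data as above.
DECLARE @Tasks TABLE ( Header VARCHAR(20), Estimate INT, TargetId INT );
DECLARE @Targets TABLE ( TargetId INT, Capacity INT );
INSERT INTO @Tasks VALUES
( 'Task 1', 80, 1 ), ( 'Task 2', 30, 1 ), ( 'Task 3', 40, 2 ), ( 'Task 4', 10, 2 );
INSERT INTO @Targets VALUES ( 1, 100 ), ( 2, 50 ), ( 3, 50 );

-- One row per (task, target) pair that received part of the estimate.
DECLARE @Allocations TABLE ( Header VARCHAR(20), TargetId INT, Amount INT );

DECLARE @Header VARCHAR(20), @Remaining INT, @AssignedTarget INT;
DECLARE @CurrentTarget INT = 0, @Free INT = 0, @Take INT;

DECLARE task_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Header, Estimate, TargetId FROM @Tasks ORDER BY Header;

OPEN task_cursor;
FETCH NEXT FROM task_cursor INTO @Header, @Remaining, @AssignedTarget;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Jump forward to the assigned target if the fill has not reached it yet.
    IF @AssignedTarget > @CurrentTarget
    BEGIN
        SET @CurrentTarget = @AssignedTarget;
        SET @Free = ISNULL((SELECT Capacity FROM @Targets WHERE TargetId = @CurrentTarget), 99999999);
    END;

    WHILE @Remaining > 0
    BEGIN
        IF @Free = 0
        BEGIN
            -- Current target is full: move on to the next one. Past the last
            -- real target, treat the capacity as unlimited (overflow bucket).
            SET @CurrentTarget += 1;
            SET @Free = ISNULL((SELECT Capacity FROM @Targets WHERE TargetId = @CurrentTarget), 99999999);
        END;

        -- Place as much of the task as the current target can still hold.
        SET @Take = CASE WHEN @Remaining < @Free THEN @Remaining ELSE @Free END;
        IF @Take > 0
            INSERT INTO @Allocations VALUES ( @Header, @CurrentTarget, @Take );
        SET @Remaining -= @Take;
        SET @Free -= @Take;
    END;

    FETCH NEXT FROM task_cursor INTO @Header, @Remaining, @AssignedTarget;
END;

CLOSE task_cursor;
DEALLOCATE task_cursor;

SELECT Header, TargetId, Amount FROM @Allocations ORDER BY Header, TargetId;
```

Tracing this by hand against the sample data, I would expect Task 1 to put 80 into target 1, Task 2 to put 20 into target 1 and 10 into target 2, Task 3 to put 40 into target 2, and Task 4 to put 10 into target 3, which lines up with the first table you said you wanted. Inside a stored procedure you could pivot or aggregate `@Allocations` into whichever display you prefer.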