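Oracle SQL: sum up values until another value is reached

Date: 2016-05-25 10:40:57

Tags: sql oracle

I hope I can describe my challenge in an understandable way. I have two tables on an Oracle Database 12c that look like this: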

Table name "invoice"

I_ID | invoice_number |     creation_date     | i_amount
------------------------------------------------------
  1  |  10000000000   |  01.02.2016 00:00:00  |   30
  2  |  10000000001   |  01.03.2016 00:00:00  |   25
  3  |  10000000002   |  01.04.2016 00:00:00  |   13
  4  |  10000000003   |  01.05.2016 00:00:00  |   18
  5  |  10000000004   |  01.06.2016 00:00:00  |   12

Table name "payment"

P_ID |   reference    |     received_date     | p_amount
------------------------------------------------------
  1  |  PAYMENT01     |  12.02.2016 13:14:12  |   12
  2  |  PAYMENT02     |  12.02.2016 15:24:21  |   28
  3  |  PAYMENT03     |  08.03.2016 23:12:00  |    2
  4  |  PAYMENT04     |  23.03.2016 12:32:13  |   30
  5  |  PAYMENT05     |  12.06.2016 00:00:00  |   15

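So I would like a select statement (perhaps using Oracle analytic functions, which I am not familiar with) in which the payments, ordered by date, are summed up until the invoice amount is reached. If the sum of, say, two payments exceeds the invoice amount, the remainder of the last payment should be applied to the next invoice.

In this example the result should look like this: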

invoice_number | reference | used_pay_amount | open_inv_amount
----------------------------------------------------------
 10000000000   | PAYMENT01 |       12        |        18
 10000000000   | PAYMENT02 |       18        |         0
 10000000001   | PAYMENT02 |       10        |        15
 10000000001   | PAYMENT03 |        2        |        13
 10000000001   | PAYMENT04 |       13        |         0
 10000000002   | PAYMENT04 |       13        |         0
 10000000003   | PAYMENT04 |        4        |        14
 10000000003   | PAYMENT05 |       14        |         0
 10000000004   | PAYMENT05 |        1        |        11 

It would be nice if there were a solution using a "simple" select statement.

Thanks in advance for your time...

2 Answers:

Answer 0: (score: 8)

Oracle Setup

CREATE TABLE invoices ( i_id, invoice_number, creation_date, i_amount ) AS
SELECT 1, 100000000, DATE '2016-01-01', 30 FROM DUAL UNION ALL
SELECT 2, 100000001, DATE '2016-02-01', 25 FROM DUAL UNION ALL
SELECT 3, 100000002, DATE '2016-03-01', 13 FROM DUAL UNION ALL
SELECT 4, 100000003, DATE '2016-04-01', 18 FROM DUAL UNION ALL
SELECT 5, 100000004, DATE '2016-05-01', 12 FROM DUAL;

CREATE TABLE payments ( p_id, reference, received_date, p_amount ) AS
SELECT 1, 'PAYMENT01', DATE '2016-01-12', 12 FROM DUAL UNION ALL
SELECT 2, 'PAYMENT02', DATE '2016-01-13', 28 FROM DUAL UNION ALL
SELECT 3, 'PAYMENT03', DATE '2016-02-08',  2 FROM DUAL UNION ALL
SELECT 4, 'PAYMENT04', DATE '2016-02-23', 30 FROM DUAL UNION ALL
SELECT 5, 'PAYMENT05', DATE '2016-05-12', 15 FROM DUAL;

Query

WITH total_invoices ( i_id, invoice_number, creation_date, i_amount, i_total ) AS (
  SELECT i.*,
         SUM( i_amount ) OVER ( ORDER BY creation_date, i_id )
  FROM   invoices i
),
total_payments ( p_id, reference, received_date, p_amount, p_total ) AS (
  SELECT p.*,
         SUM( p_amount ) OVER ( ORDER BY received_date, p_id )
  FROM   payments p
)
SELECT invoice_number,
       reference,
       LEAST( p_total, i_total )
         - GREATEST( p_total - p_amount, i_total - i_amount ) AS used_pay_amount,
       GREATEST( i_total - p_total, 0 ) AS open_inv_amount
FROM   total_invoices
       INNER JOIN
       total_payments
       ON (    i_total - i_amount < p_total
           AND i_total > p_total - p_amount );

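Explanation

The two sub-query factoring (WITH ... AS ()) clauses just add an extra virtual column to the invoices and payments tables, containing the cumulative sum of the invoice/payment amounts.

You can associate a range with each invoice (or payment): it runs from the cumulative amount owed (paid) before that invoice (payment) to the cumulative amount owed (paid) after it. The two tables can then be joined wherever these ranges overlap.

open_inv_amount is the positive difference between the cumulative amount invoiced and the cumulative amount paid.

used_pay_amount is slightly more complicated: it is the difference between the lower of the current cumulative invoice and payment totals and the higher of the previous cumulative invoice and payment totals.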
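To make the range-overlap idea concrete, here is a minimal Python sketch (not part of the original answer) that mirrors the logic of the query on the sample data. The names with_running_total and allocate are made up for illustration, and the sketch assumes positive amounts and lists that are already in allocation order.

```python
from itertools import accumulate

INVOICES = [(100000000, 30), (100000001, 25), (100000002, 13),
            (100000003, 18), (100000004, 12)]
PAYMENTS = [("PAYMENT01", 12), ("PAYMENT02", 28), ("PAYMENT03", 2),
            ("PAYMENT04", 30), ("PAYMENT05", 15)]


def with_running_total(rows):
    """Attach a cumulative sum to each (key, amount) row,
    playing the role of SUM(amount) OVER (ORDER BY ...)."""
    totals = accumulate(amount for _, amount in rows)
    return [(key, amount, total) for (key, amount), total in zip(rows, totals)]


def allocate(invoices, payments):
    """Pair every invoice with every payment whose cumulative range overlaps it."""
    total_payments = with_running_total(payments)
    result = []
    for inv, i_amount, i_total in with_running_total(invoices):
        for ref, p_amount, p_total in total_payments:
            # Same overlap test as the ON clause of the join.
            if i_total - i_amount < p_total and i_total > p_total - p_amount:
                used = (min(p_total, i_total)
                        - max(p_total - p_amount, i_total - i_amount))
                open_amount = max(i_total - p_total, 0)
                result.append((inv, ref, used, open_amount))
    return result


for row in allocate(INVOICES, PAYMENTS):
    print(row)
```

Like the join itself, the nested loop compares every invoice with every payment, so the work grows with the product of the two table sizes.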

Output

INVOICE_NUMBER REFERENCE USED_PAY_AMOUNT OPEN_INV_AMOUNT
-------------- --------- --------------- ---------------
     100000000 PAYMENT01              12              18
     100000000 PAYMENT02              18               0
     100000001 PAYMENT02              10              15
     100000001 PAYMENT03               2              13
     100000001 PAYMENT04              13               0
     100000002 PAYMENT04              13               0
     100000003 PAYMENT04               4              14
     100000003 PAYMENT05              14               0
     100000004 PAYMENT05               1              11

Update

Based on mathguy's method of using a UNION to combine the data, I came up with a different solution that re-uses some of my code.

WITH combined ( invoice_number, reference, i_amt, i_total, p_amt, p_total, total ) AS (
  SELECT invoice_number,
         NULL,
         i_amount,
         SUM( i_amount ) OVER ( ORDER BY creation_date, i_id ),
         NULL,
         NULL,
         SUM( i_amount ) OVER ( ORDER BY creation_date, i_id )
  FROM   invoices
  UNION ALL
  SELECT NULL,
         reference,
         NULL,
         NULL,
         p_amount,
         SUM( p_amount ) OVER ( ORDER BY received_date, p_id ),
         SUM( p_amount ) OVER ( ORDER BY received_date, p_id )
  FROM   payments
  ORDER BY 7,
           2 NULLS LAST,
           1 NULLS LAST
),
filled ( invoice_number, reference, i_prev, i_total, p_prev, p_total ) AS (
  SELECT FIRST_VALUE( invoice_number )  IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
         FIRST_VALUE( reference )       IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
         FIRST_VALUE( i_total - i_amt ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
         FIRST_VALUE( i_total )         IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
         FIRST_VALUE( p_total - p_amt ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
         COALESCE(
           p_total,
           LEAD( p_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM ),
           LAG( p_total )  IGNORE NULLS OVER ( ORDER BY ROWNUM )
         )
   FROM  combined
),
vals ( invoice_number, reference, upa, oia, prev_invoice ) AS (
  SELECT invoice_number,
         reference,
         COALESCE( LEAST( p_total, i_total ) - GREATEST( p_prev, i_prev ), 0 ),
         GREATEST( i_total - p_total, 0 ),
         LAG( invoice_number ) OVER ( ORDER BY ROWNUM )
  FROM   filled 
)
SELECT invoice_number,
       reference,
       upa AS used_pay_amount,
       oia AS open_inv_amount
FROM   vals
WHERE  upa > 0
OR     ( reference IS NULL AND invoice_number <> prev_invoice AND oia > 0 );

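Explanation

The combined sub-query factoring clause joins the two tables with a UNION ALL and generates the cumulative totals for the amounts invoiced and paid. The last thing it does is order the rows by ascending cumulative total (if there are ties, it puts the payments, in the order they were created, before the invoices).

The filled sub-query factoring clause back-fills the previously generated table: if a value is null, it takes the value from the next non-null row (and if there is an invoice with no payments, it finds the total of the previous payments from the preceding rows).

The vals sub-query factoring clause applies the same calculations as my previous query (see above). It also adds the prev_invoice column to help identify invoices that are entirely unpaid.

The final SELECT takes the values and filters out the unnecessary rows.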
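The back-fill step is the least obvious part, so here is a small Python sketch of what FIRST_VALUE( ... ) IGNORE NULLS over a window from CURRENT ROW to UNBOUNDED FOLLOWING computes for a single column. The function name backfill and the sample list are hypothetical; this only illustrates the idea and is not the answer's code.

```python
def backfill(values):
    """For each position, return the first non-None value at or after it.

    Positions with no non-None value at or after them stay None, just as
    the window function returns NULL when nothing follows.
    """
    filled = [None] * len(values)
    next_value = None
    # Walk from the last row to the first, remembering the latest non-None value.
    for idx in range(len(values) - 1, -1, -1):
        if values[idx] is not None:
            next_value = values[idx]
        filled[idx] = next_value
    return filled


print(backfill([None, "PAYMENT01", None, None, "PAYMENT02", None]))
```

Rows that come after the last non-null value stay empty, which is why the query handles p_total separately with COALESCE, LEAD and LAG.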

Answer 1: (score: 1)

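Here is a solution that does not require a join. This matters if the amount of data is significant. I did some testing on my laptop using the free edition (XE) of Oracle 11.2 (nothing commercial). With MT0's solution, the join-based query takes about 11 seconds for 10k invoices and 10k payments. For 50k invoices and 50k payments, the query took 287 seconds (almost 5 minutes). This is understandable, since joining two 50k-row tables requires 2.5 billion comparisons.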

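The alternative solution below uses a union. It uses the analytic functions lead() and last_value() to do the work that the join does in the other solution. This union-based solution, with 50k invoices and 50k payments, took less than 0.5 seconds on my laptop (!)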

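I simplified the setup a bit; i_id, invoice_number and creation_date all serve one purpose only: to order the invoice amounts. I use just an inv_id (invoice id) for that purpose, and similarly for payments.

For testing purposes, I created the tables invoices and payments like so: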

create table invoices (inv_id, inv_amt) as 
   (select level, trunc(dbms_random.value(20, 80)) from dual connect by level <= 50000);
create table payments (pmt_id, pmt_amt) as 
   (select level, trunc(dbms_random.value(20, 80)) from dual connect by level <= 50000);

Then, to test the solutions, I used each query to populate a table via CTAS, like so:

create table bal_of_pmts as
   [select query, including the WITH clause but without the setup CTE's, comes here]

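In my solution, I mean to show the allocation of payments to one or more invoices, and the payment of invoices from one or more payments. The output discussed in the original post covers only half of this information, but for symmetry it makes more sense to me to show both halves. The output (with the same inputs as in the original post) looks like this, using my versions of inv_id and pmt_id: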

    INV_ID       PAID     UNPAID     PMT_ID       USED  AVAILABLE
---------- ---------- ---------- ---------- ---------- ----------
         1         12         18        101         12          0
         1         18          0        103         18         10
         2         10         15        103         10          0
         2          2         13        105          2          0
         2         13          0        107         13         17
         3         13          0        107         13          4
         4          4         14        107          4          0
         4         14          0        109         14          1
         5          1         11        109          1          0
         5         11          0                    11

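Notice that the left half is what the original post requested. There is one extra row at the end. Notice the NULL payment id for a payment of 11 - that shows how much of the last invoice is still left uncovered. If there were an invoice with id = 6 for an amount of, say, 22, then there would be one more row - showing the entire amount of that invoice (22) as "paid" from a payment with no id - meaning it is actually not covered (yet).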
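As a rough illustration of the union idea, the following Python sketch (not the answer's code) sorts the cumulative totals of both tables into one list of breakpoints and emits one row per gap between consecutive breakpoints. The function name allocate_by_union is made up, and the sketch assumes positive amounts and lists that are already ordered by id.

```python
from itertools import accumulate

INVOICES = [(1, 30), (2, 25), (3, 13), (4, 18), (5, 12)]
PAYMENTS = [(101, 12), (103, 28), (105, 2), (107, 30), (109, 15)]


def allocate_by_union(invoices, payments):
    """Return rows of (inv_id, paid, unpaid, pmt_id, available)."""
    inv_totals = list(accumulate(amount for _, amount in invoices))
    pay_totals = list(accumulate(amount for _, amount in payments))
    # The "UNION ALL": every cumulative total from both sides, sorted ascending.
    breakpoints = sorted(inv_totals + pay_totals)

    rows = []
    prev = 0   # previous breakpoint
    i = p = 0  # invoice / payment currently being consumed
    for point in breakpoints:
        paid = point - prev
        if paid != 0:  # equal totals give zero-width segments; skip them
            inv_id = invoices[i][0] if i < len(invoices) else None
            pmt_id = payments[p][0] if p < len(payments) else None
            unpaid = inv_totals[i] - point if i < len(invoices) else None
            available = pay_totals[p] - point if p < len(payments) else None
            rows.append((inv_id, paid, unpaid, pmt_id, available))
        # Move past every invoice / payment that is used up at this breakpoint.
        while i < len(invoices) and inv_totals[i] <= point:
            i += 1
        while p < len(payments) and pay_totals[p] <= point:
            p += 1
        prev = point
    return rows


for row in allocate_by_union(INVOICES, PAYMENTS):
    print(row)
```

In this sketch the dominant cost is a single sort of the combined list, so nothing is compared pairwise the way a join would.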

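The query may be easier to understand than the join approach. To see what it does, it may help to look closely at the intermediate results, especially the CTE c (in the WITH clause).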

with invoices (inv_id, inv_amt) as (
        select   1, 30 from dual union all
        select   2, 25 from dual union all
        select   3, 13 from dual union all
        select   4, 18 from dual union all
        select   5, 12 from dual
     ),
     payments (pmt_id, pmt_amt) as (
        select 101, 12 from dual union all
        select 103, 28 from dual union all
        select 105,  2 from dual union all
        select 107, 30 from dual union all
        select 109, 15 from dual
     ),
     c (kind, inv_id, inv_cml, pmt_id, pmt_cml, cml_amt) as (
        select 'i', inv_id, sum(inv_amt) over (order by inv_id), null, null, 
               sum(inv_amt) over (order by inv_id)
           from invoices
        union all
        select 'p', null, null, pmt_id, sum(pmt_amt) over (order by pmt_id),
               sum(pmt_amt) over (order by pmt_id)
           from payments
     ),
     d (inv_id, paid, unpaid, pmt_id, used, available) as (
        select last_value(inv_id) ignore nulls over (order by cml_amt desc),
               cml_amt - lead(cml_amt, 1, 0) over (order by cml_amt desc),
               case kind when 'i' then 0
                         else last_value(inv_cml) ignore nulls 
                                   over (order by cml_amt desc) - cml_amt end,
               last_value(pmt_id) ignore nulls over (order by cml_amt desc),
               cml_amt - lead(cml_amt, 1, 0) over (order by cml_amt desc),
               case kind when 'p' then 0 
                         else last_value(pmt_cml) ignore nulls
                                   over (order by cml_amt desc) - cml_amt end
        from c
     )
select   inv_id, paid, unpaid, pmt_id, used, available
from     d
where    paid != 0
order by inv_id, pmt_id
;

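In most cases the CTE d is all we need. However, if the cumulative sum of several invoices is exactly equal to the cumulative sum of several payments, my query would add a row with paid = unpaid = 0. (MT0's join solution does not have this problem.) To cover all possible cases without returning rows that carry no information, I had to add the filter paid != 0.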
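A tiny Python sketch shows where such a zero row comes from. The amounts are made up for illustration: two invoices (30 and 10) and two payments (12 and 28) both reach a cumulative total of 40.

```python
from itertools import accumulate

inv_totals = list(accumulate([30, 10]))  # cumulative invoice totals: 30, 40
pay_totals = list(accumulate([12, 28]))  # cumulative payment totals: 12, 40

# Union of both sets of totals, sorted; the shared total 40 appears twice.
breakpoints = sorted(inv_totals + pay_totals)

# Gap between each breakpoint and the one before it (starting from 0).
segments = [cur - prev for prev, cur in zip([0] + breakpoints, breakpoints)]

# Counterpart of the paid != 0 filter: drop the zero-width gap.
nonzero = [seg for seg in segments if seg != 0]

print(breakpoints, segments, nonzero)
```

The duplicated total produces a gap of width zero, which is the row that the paid != 0 filter removes.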