Select rows where the cumulative sum is less than a number (in order of priority)

Time: 2019-02-08 19:43:10

Tags: sql oracle oracle12c running-total cumulative-sum

I have a table with id, cost, and priority columns:

create table a_test_table (id number(4,0), cost number(15,2), priority number(4,0));

insert into a_test_table (id, cost, priority) values (1, 1000000, 10);
insert into a_test_table (id, cost, priority) values (2, 10000000, 9);
insert into a_test_table (id, cost, priority) values (3, 5000000, 8);
insert into a_test_table (id, cost, priority) values (4, 19000000, 7);
insert into a_test_table (id, cost, priority) values (5, 20000000, 6);
insert into a_test_table (id, cost, priority) values (6, 15000000, 5);
insert into a_test_table (id, cost, priority) values (7, 2000000, 4);
insert into a_test_table (id, cost, priority) values (8, 3000000, 3);
insert into a_test_table (id, cost, priority) values (9, 3000000, 2);
insert into a_test_table (id, cost, priority) values (10, 8000000, 1);
commit;

select 
    id,
    to_char(cost, '$999,999,999') as cost,
    priority
from 
    a_test_table;
        ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         4   $19,000,000          7
         5   $20,000,000          6
         6   $15,000,000          5
         7    $2,000,000          4
         8    $3,000,000          3
         9    $3,000,000          2
        10    $8,000,000          1

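Starting with the highest priority (descending), I want to select the rows whose cost adds up to less than (or equal to) $20,000,000. A row that would push the total over that limit is skipped, and lower-priority rows are still considered.

The result would look like this:

        ID COST            PRIORITY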
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         7    $2,000,000          4

      Total: $18,000,000
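
To make the intended selection rule concrete, here is a minimal Python sketch (not Oracle SQL, and not taken from any of the answers). The names ROWS and select_within_budget are hypothetical; the data is copied from the table above. It assumes that a row which does not fit is simply skipped, which is what the expected result implies (ids 4, 5 and 6 are passed over, then id 7 is taken).

```python
# Rows copied from the question as (id, cost, priority).
ROWS = [
    (1, 1_000_000, 10), (2, 10_000_000, 9), (3, 5_000_000, 8),
    (4, 19_000_000, 7), (5, 20_000_000, 6), (6, 15_000_000, 5),
    (7, 2_000_000, 4), (8, 3_000_000, 3), (9, 3_000_000, 2),
    (10, 8_000_000, 1),
]


def select_within_budget(rows, limit):
    """Walk rows by descending priority; keep a row only if it still fits."""
    chosen, total = [], 0
    for row_id, cost, _priority in sorted(rows, key=lambda r: r[2], reverse=True):
        if total + cost <= limit:
            chosen.append(row_id)
            total += cost
        # Otherwise the row is skipped; lower-priority rows are still tried.
    return chosen, total


chosen, total = select_within_budget(ROWS, 20_000_000)
print(chosen, total)  # [1, 2, 3, 7] 18000000
```

After ids 1, 2 and 3 the total is $16,000,000; ids 4, 5 and 6 would each exceed the limit, id 7 brings the total to $18,000,000, and none of ids 8, 9 or 10 fits after that.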

How can I do this with Oracle SQL?

3 Answers:

Answer 0 (score: 8)

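Here is a way to do it in pure SQL. I won't swear there isn't a better way.

Basically, it uses a recursive common table expression (i.e., WITH costed...) to compute every possible combination of elements, taken in descending priority order, whose total is at most 20,000,000.

Then it takes the first full path from that result, where a full path is one that no other path extends (the NOT EXISTS check) and "first" means first when ordered by path.

Then, it gets all the rows in that path.

Note: the logic assumes id is no more than 5 digits long. That's what the LPAD(id,5,'0') stuff is for.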

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000),
best_path as (  
SELECT *
FROM   costed c 
where not exists ( SELECT 'longer path' FROM costed c2 WHERE c2.path like c.path || '|%' )
order by path
fetch first 1 row only )
SELECT att.* 
FROM best_path cross join a_test_table att
WHERE best_path.path like '%' || lpad(att.id,5,'0') || '%'
order by att.priority desc;
+----+----------+----------+
| ID |   COST   | PRIORITY |
+----+----------+----------+
|  1 |  1000000 |       10 |
|  2 | 10000000 |        9 |
|  3 |  5000000 |        8 |
|  7 |  2000000 |        4 |
+----+----------+----------+
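
The path-enumeration idea in the query above can be sketched in Python. This is only an illustration under stated assumptions, not the Oracle execution plan: ROWS, LIMIT, pad, all_paths and best_path are hypothetical names, and the data is copied from the question. Note that "first by path" orders by zero-padded id, so it seems to agree with priority order here only because ids in the sample data increase as priority decreases.

```python
# Rows copied from the question as (id, cost, priority).
ROWS = [
    (1, 1_000_000, 10), (2, 10_000_000, 9), (3, 5_000_000, 8),
    (4, 19_000_000, 7), (5, 20_000_000, 6), (6, 15_000_000, 5),
    (7, 2_000_000, 4), (8, 3_000_000, 3), (9, 3_000_000, 2),
    (10, 8_000_000, 1),
]
LIMIT = 20_000_000


def pad(row_id):
    """Mirror LPAD(id, 5, '0') so that string order follows numeric id order."""
    return str(row_id).zfill(5)


def all_paths(rows, limit):
    """Enumerate every chain of strictly descending priority within the limit."""
    # Anchor member: every affordable row starts a path of its own.
    frontier = [(pad(i), cost, prio) for i, cost, prio in rows if cost <= limit]
    paths = []
    while frontier:
        paths.extend(path for path, _, _ in frontier)
        # Recursive member: extend each new path by any lower-priority row that fits.
        frontier = [
            (path + "|" + pad(i), running + cost, prio)
            for path, running, last_prio in frontier
            for i, cost, prio in rows
            if prio < last_prio and running + cost <= limit
        ]
    return paths


def best_path(paths):
    """Keep paths that no other path extends, then take the first in string order."""
    full = [p for p in paths if not any(q.startswith(p + "|") for q in paths)]
    return min(full)


best = best_path(all_paths(ROWS, LIMIT))
ids = [int(part) for part in best.split("|")]
print(ids)  # [1, 2, 3, 7]
```

The sketch materializes every qualifying combination before picking one, which is why this approach can grow quickly on larger tables.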

Update - shorter version

This version uses MATCH_RECOGNIZE to find all the rows of the best group after the recursive CTE has run:

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000)
  search depth first by priority desc set ord
SELECT id, cost, priority
FROM   costed c 
MATCH_RECOGNIZE (
  ORDER BY path
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN (STRT ADDON*)
  DEFINE
    ADDON AS ADDON.PATH = PREV(ADDON.PATH) || '|' || LPAD(ADDON.ID,5,'0')
    )
WHERE mno = 1
ORDER BY priority DESC;

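Update - even shorter version, using a clever idea from the SQL Server link the OP posted

*Edit: removed the use of ROWNUM=1 in the anchor part of the recursive CTE, since it depended on the arbitrary order in which rows are returned. I'm surprised nobody called me out on that.*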

WITH costed (id, cost, priority, running_cost) as 
( SELECT id, cost, priority, cost running_cost
  FROM   ( SELECT * FROM a_test_table
           WHERE  cost <= 20000000
           ORDER BY priority desc
           FETCH FIRST 1 ROW ONLY )
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost
  FROM   costed CROSS APPLY ( SELECT b.*
                              FROM   a_test_table b 
                              WHERE  b.priority < costed.priority
                              AND    b.cost + costed.running_cost <= 20000000
                              ORDER BY b.priority DESC   -- without this, the single row picked is arbitrary
                              FETCH FIRST 1 ROW ONLY
                              ) a
)
CYCLE id SET is_cycle TO 'Y' DEFAULT 'N'
select id, cost, priority from costed
order by priority desc

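Answer 1 (score: 3)

I'm too stupid to do it in plain SQL, so I tried PL/SQL: a function that returns a collection which can be queried as a table. Here's how: loop through all the rows in the table in descending priority order, keeping a running sum; if adding a row keeps the sum within the limit, fine - add the row's ID to the array and move on; otherwise, take its cost back out of the sum and skip it.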

create or replace function f_pri (par_limit in number)
  return sys.odcinumberlist
is
  l_sum   number := 0;                                 -- running total of accepted rows
  l_arr   sys.odcinumberlist := sys.odcinumberlist();  -- IDs of accepted rows
begin
  -- visit rows from highest to lowest priority
  for cur_r in (select id, cost, priority
                from a_test_table
                order by priority desc
               )
  loop
    l_sum := l_sum + cur_r.cost;
    if l_sum <= par_limit then
       -- the row fits: remember its ID
       l_arr.extend;
       l_arr(l_arr.last) := cur_r.id;
    else
       -- the row would exceed the limit: undo the addition and skip it
       l_sum := l_sum - cur_r.cost;
    end if;
  end loop;
  return (l_arr);
end;
/

Function created.

Preparing the SQL*Plus environment so that the output looks prettier:

SQL> break on report
SQL> compute sum of cost on report
SQL> set ver off

Testing:

SQL> select t.id, t.cost, t.priority
  2  from table(f_pri(&par_limit)) x join a_test_table t on t.id = x.column_value
  3  order by t.priority desc;
Enter value for par_limit: 20000000

        ID       COST   PRIORITY
---------- ---------- ----------
         1    1000000         10
         2   10000000          9
         3    5000000          8
         7    2000000          4
           ----------
sum          18000000

SQL> /
Enter value for par_limit: 30000000

        ID       COST   PRIORITY
---------- ---------- ----------
         1    1000000         10
         2   10000000          9
         3    5000000          8
         7    2000000          4
         8    3000000          3
         9    3000000          2
           ----------
sum          24000000

6 rows selected.

SQL>

Answer 2 (score: 3)

@ypercubeᵀᴹ posted this solution in the DBA-SE chat room. It is very concise.

with  rt (id, cost, running_total, priority) as
(
    (
    select 
        id,
        cost,
        cost as running_total,
        priority
    from 
        a_test_table
    where cost <= 20000000 
    order by priority desc
    fetch first 1 rows only
    )

    union all

        select 
            t.id,
            t.cost,
            t.cost + rt.running_total,
            t.priority
        from a_test_table  t
             join rt 
             on t.priority < rt.priority      -- redundant but throws
                                              -- "cycle detected error" if omitted

             and t.priority =                             -- needed 
                 ( select max(tm.priority) from a_test_table tm
                   where tm.priority < rt.priority
                     and tm.cost + rt.running_total <= 20000000 )
    )
    select *
    from rt ;

(@ypercubeᵀᴹ wasn't interested in posting it themselves.)
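
For readers who want to follow the recursion step by step, here is a rough Python sketch of what the rt CTE appears to compute: start from the highest-priority affordable row, then repeatedly jump to the highest-priority row below the current one that still fits. The names ROWS, next_row and running_selection are hypothetical, the data is copied from the question, and the sketch assumes priorities are unique.

```python
# Rows copied from the question as (id, cost, priority).
ROWS = [
    (1, 1_000_000, 10), (2, 10_000_000, 9), (3, 5_000_000, 8),
    (4, 19_000_000, 7), (5, 20_000_000, 6), (6, 15_000_000, 5),
    (7, 2_000_000, 4), (8, 3_000_000, 3), (9, 3_000_000, 2),
    (10, 8_000_000, 1),
]


def next_row(rows, current_priority, running_total, limit):
    """Highest-priority row below current_priority that still fits, or None."""
    candidates = [
        r for r in rows
        if r[2] < current_priority and running_total + r[1] <= limit
    ]
    return max(candidates, key=lambda r: r[2], default=None)


def running_selection(rows, limit):
    """Return (id, cost, running_total, priority) tuples, like the rt columns."""
    result, total = [], 0
    # Anchor step: the highest-priority row whose cost alone is within the limit.
    row = next_row(rows, float("inf"), 0, limit)
    while row is not None:
        total += row[1]
        result.append((row[0], row[1], total, row[2]))
        # Recursive step: look only below the priority that was just taken.
        row = next_row(rows, row[2], total, limit)
    return result


for line in running_selection(ROWS, 20_000_000):
    print(line)
```

Each step produces at most one new row, so unlike enumerating every combination, the number of steps is bounded by the number of rows in the table.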