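Oracle SQL / PLSQL: Hierarchical recursive query with duplicate data

Asked: 2019-03-27 20:48:11

Tags: sql oracle recursion plsql hierarchical-data

I have a recursive function below that works well, but I have now found that some of the data is not unique, and I need a way to handle that.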

FUNCTION calc_cost (model_no_         NUMBER,
                    revision_         NUMBER,
                    sequence_no_   IN NUMBER,
                    currency_      IN VARCHAR2)
    RETURN NUMBER
IS
    qty_    NUMBER := 0;
    cost_   NUMBER := 0;
BEGIN
    SELECT NVL (new_qty, qty), purch_cost
      INTO qty_, cost_
      FROM prod_conf_cost_struct_clv
     WHERE model_no = model_no_
       AND revision = revision_
       AND sequence_no = sequence_no_
       AND (purch_curr = currency_
         OR purch_curr IS NULL);

    IF cost_ IS NULL
    THEN
        SELECT SUM (calc_cost (model_no,
                               revision,
                               sequence_no,
                               purch_curr))
          INTO cost_
          FROM prod_conf_cost_struct_clv
         WHERE model_no = model_no_
           AND revision = revision_
           AND (purch_curr = currency_
             OR purch_curr IS NULL)
           AND part_no IN (SELECT component_part
                             FROM prod_conf_cost_struct_clv
                            WHERE model_no = model_no_
                              AND revision = revision_
                              AND sequence_no = sequence_no_);
    END IF;

    RETURN qty_ * cost_;
EXCEPTION
    WHEN NO_DATA_FOUND
    THEN
        RETURN 0;
END calc_cost;

The following condition is where this function fails: part_no IN (SELECT component_part ...

Sample data:

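+--------+----------+----------+-------------+---------+----------------+-------+------+------------+-----+
| rownum | model_no | revision | sequence_no | part_no | component_part | level | cost | purch_curr | qty |
+--------+----------+----------+-------------+---------+----------------+-------+------+------------+-----+
|      1 |       62 |        1 |          00 | XXX     | ABC            |     1 | null | null       |   1 |
|      2 |       62 |        1 |          10 | ABC     | 123            |     2 | null | null       |   1 |
|      3 |       62 |        1 |          20 | 123     | DEF            |     3 | null | null       |   1 |
|      4 |       62 |        1 |          30 | DEF     | 456            |     4 |  100 | GBP        |   1 |
|      5 |       62 |        1 |          40 | DEF     | 789            |     4 |   50 | GBP        |   1 |
|      6 |       62 |        1 |          50 | DEF     | 024            |     4 |   20 | GBP        |   1 |
|      7 |       62 |        1 |          60 | ABC     | 356            |     2 | null | null       |   2 |
|      8 |       62 |        1 |          70 | 356     | DEF            |     3 | null | null       |   3 |
|      9 |       62 |        1 |          80 | DEF     | 456            |     4 |  100 | GBP        |   1 |
|     10 |       62 |        1 |          90 | DEF     | 789            |     4 |   50 | EUR        |   1 |
|     11 |       62 |        1 |         100 | DEF     | 024            |     4 |   20 | GBP        |   1 |
+--------+----------+----------+-------------+---------+----------------+-------+------+------------+-----+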

If I pass the following values into the function parameters: model_no, revision, sequence_no (ignore the currency, as it is not relevant to the problem):

62, 1, 20

I expect it to sum only rows 4-6 = 170, but it sums rows 4-6 and rows 9-11 = 340.
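To see why the total doubles, it can help to run the failing condition on its own. The query below is only a diagnostic sketch against the table from the question, with the values 62, 1, 20 hard-coded for illustration. The subquery returns 'DEF' for sequence_no 20, and the outer filter then accepts every row whose part_no is 'DEF', whichever branch of the tree it sits in.

```sql
-- Diagnostic sketch: which rows does the recursive step treat as
-- "children" of sequence_no 20? The literals 62, 1, 20 are the example
-- inputs from the question, hard-coded here for illustration.
SELECT sequence_no, part_no, component_part
  FROM prod_conf_cost_struct_clv
 WHERE model_no = 62
   AND revision = 1
   AND part_no IN (SELECT component_part
                     FROM prod_conf_cost_struct_clv
                    WHERE model_no = 62
                      AND revision = 1
                      AND sequence_no = 20)
 ORDER BY sequence_no;
```

On the sample data this matches sequence_no 30, 40, 50 and also 80, 90, 100, because nothing in the condition ties a 'DEF' row to one particular parent.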

This function will ultimately be used in the following SQL query:

    SELECT LEVEL,
           SYS_CONNECT_BY_PATH (sequence_no, '->') PATH,
           calc_cost (model_no,
                      revision,
                      sequence_no,
                      'GBP')
               total_gbp
      FROM prod_conf_cost_struct_clv
     WHERE model_no = 62
       AND revision = 1
CONNECT BY PRIOR component_part = part_no
       AND PRIOR model_no = 62
       AND PRIOR revision = 1
START WITH sequence_no = 20
  ORDER BY sequence_no

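As you can see, this also introduces the same component_part = part_no problem.

Update

Based on the answers provided, I thought I would extend the original question to handle the currency and quantity elements as well. I have updated the sample data to include currency and quantity.

If I pass the following values into the function parameters (model_no, revision, sequence_no, currency):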

Input: 62, 1, 70, EUR 
Expected Cost Output: 150

Input: 62, 1, 60, EUR 
Expected Cost Output: 300

Input: 62, 1, 60, GBP
Expected Cost Output: 720
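These expected values follow from multiplying quantities down the tree. For 62, 1, 70, EUR only row 10 is priced in EUR, so the cost is 3 x (1 x 50) = 150; one level up, 60 in EUR is 2 x 150 = 300; and 60 in GBP is 2 x 3 x (100 + 20) = 720. The query below is a rough sketch of that rollup using a recursive WITH clause. It assumes sequence_no follows a depth-first ordering, so that a row's parent is the nearest earlier row one level up; the names sample_rows, linked, walk and ext_qty are made up for this illustration, and the root (60) and currency ('GBP') are hard-coded.

```sql
WITH sample_rows (sequence_no, lvl, cost, purch_curr, qty) AS (
  SELECT   0, 1, CAST (NULL AS NUMBER), CAST (NULL AS VARCHAR2 (3)), 1 FROM DUAL UNION ALL
  SELECT  10, 2, NULL, NULL,  1 FROM DUAL UNION ALL
  SELECT  20, 3, NULL, NULL,  1 FROM DUAL UNION ALL
  SELECT  30, 4,  100, 'GBP', 1 FROM DUAL UNION ALL
  SELECT  40, 4,   50, 'GBP', 1 FROM DUAL UNION ALL
  SELECT  50, 4,   20, 'GBP', 1 FROM DUAL UNION ALL
  SELECT  60, 2, NULL, NULL,  2 FROM DUAL UNION ALL
  SELECT  70, 3, NULL, NULL,  3 FROM DUAL UNION ALL
  SELECT  80, 4,  100, 'GBP', 1 FROM DUAL UNION ALL
  SELECT  90, 4,   50, 'EUR', 1 FROM DUAL UNION ALL
  SELECT 100, 4,   20, 'GBP', 1 FROM DUAL
),
-- Assumption: the parent is the closest preceding row exactly one level up.
linked AS (
  SELECT c.sequence_no, c.cost, c.purch_curr, c.qty,
         (SELECT MAX (p.sequence_no)
            FROM sample_rows p
           WHERE p.lvl = c.lvl - 1
             AND p.sequence_no < c.sequence_no) AS parent_sequence_no
    FROM sample_rows c
),
-- Walk down from the chosen root, carrying the product of quantities.
walk (sequence_no, ext_qty, cost, purch_curr) AS (
  SELECT sequence_no, qty, cost, purch_curr
    FROM linked
   WHERE sequence_no = 60              -- root to cost (example input)
  UNION ALL
  SELECT l.sequence_no, w.ext_qty * l.qty, l.cost, l.purch_curr
    FROM walk w
    JOIN linked l ON l.parent_sequence_no = w.sequence_no
)
SELECT SUM (ext_qty * cost) AS total_cost
  FROM walk
 WHERE purch_curr = 'GBP';             -- currency to report (example input)
```

Changing the two literals to 70 and 'EUR', or 60 and 'EUR', gives the other two cases. Rows with a NULL cost drop out of the sum on their own, since they also have a NULL purch_curr.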

Any help would be appreciated.

Thanks.

3 Answers:

Answer 0 (score: 2)

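Note: if you have trouble running MATCH_RECOGNIZE, it may be because you are running a (not too) old version of SQL*Developer. Try the latest version, or use SQL*Navigator, TOAD, or SQL*Plus instead. The problem is the "?" character, which confuses SQL*Developer because it is the character JDBC uses for bind variables.

You have a data model problem. Namely, the child records in your prod_conf_cost_struct_clv table are not explicitly linked to their parent rows. That is why the "DEF" subassembly causes problems. Without an explicit link, the data cannot be calculated cleanly.

You should correct this data model and add a parent_sequence_no to each record, so that (for example) you can tell that sequence_no 80 is a child of sequence_no 70 and not a child of sequence_no 20.

However, since I cannot assume you have the time or authority to change the data model, I will answer the question with the data model as it is.

First, let's add QTY and PURCH_CURR to the sample data.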

with prod_conf_cost_struct_clv ( model_no, revision, sequence_no, part_no, component_part, lvl, cost, qty, purch_curr ) as
( 
SELECT 62, 1, 00, 'XXX', 'ABC', 1, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 10, 'ABC', '123', 2, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 20, '123', 'DEF', 3, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 30, 'DEF', '456', 4, 100, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 40, 'DEF', '789', 4, 50, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 50, 'DEF', '024', 4, 20, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 60, 'ABC', '356', 2, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 70, '356', 'DEF', 3, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 80, 'DEF', '456', 4, 100, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 90, 'DEF', '789', 4, 50, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 100, 'DEF', '024', 4, 20, 1, 'GBP' FROM DUAL )
select * from prod_conf_cost_struct_clv;
+----------+----------+-------------+---------+----------------+-----+------+-----+------------+
| MODEL_NO | REVISION | SEQUENCE_NO | PART_NO | COMPONENT_PART | LVL | COST | QTY | PURCH_CURR |
+----------+----------+-------------+---------+----------------+-----+------+-----+------------+
|       62 |        1 |           0 | XXX     | ABC            |   1 |      |   1 | GBP        |
|       62 |        1 |          10 | ABC     | 123            |   2 |      |   1 | GBP        |
|       62 |        1 |          20 | 123     | DEF            |   3 |      |   1 | GBP        |
|       62 |        1 |          30 | DEF     | 456            |   4 |  100 |   1 | GBP        |
|       62 |        1 |          40 | DEF     | 789            |   4 |   50 |   1 | GBP        |
|       62 |        1 |          50 | DEF     | 024            |   4 |   20 |   1 | GBP        |
|       62 |        1 |          60 | ABC     | 356            |   2 |      |   1 | GBP        |
|       62 |        1 |          70 | 356     | DEF            |   3 |      |   1 | GBP        |
|       62 |        1 |          80 | DEF     | 456            |   4 |  100 |   1 | GBP        |
|       62 |        1 |          90 | DEF     | 789            |   4 |   50 |   1 | GBP        |
|       62 |        1 |         100 | DEF     | 024            |   4 |   20 |   1 | GBP        |
+----------+----------+-------------+---------+----------------+-----+------+-----+------------+

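Note: you did not show how multiple currencies are represented in your test data, so my handling of that in this answer may be incorrect.

OK, so the first thing we need to do is work out the value of parent_sequence_no (which really should be in your table; see above). Since it is not in your table, we need to compute it. We will compute it as the sequence_no of the row with the highest sequence_no that is lower than the current row's, whose lvl (named lvl rather than level to avoid the Oracle keyword) is one less than the current row's, and whose component_part matches the current row's part_no.

To find this value efficiently, we can use the MATCH_RECOGNIZE feature to describe what each child's parent row should look like.

We will call the result set with the new parent_sequence_no column corrected_hierarchy.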

with prod_conf_cost_struct_clv ( model_no, revision, sequence_no, part_no, component_part, lvl, cost, qty, purch_curr ) as
( 
SELECT 62, 1, 00, 'XXX', 'ABC', 1, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 10, 'ABC', '123', 2, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 20, '123', 'DEF', 3, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 30, 'DEF', '456', 4, 100, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 40, 'DEF', '789', 4, 50, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 50, 'DEF', '024', 4, 20, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 60, 'ABC', '356', 2, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 70, '356', 'DEF', 3, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 80, 'DEF', '456', 4, 100, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 90, 'DEF', '789', 4, 50, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 100, 'DEF', '024', 4, 20, 1, 'GBP' FROM DUAL )
-- Step 1: correct for your data model problem, which is the fact that child rows
-- (e.g., operations 30-50) are not *explicitly* linked to their parent rows (e.g.,
-- operation 20)
, corrected_hierarchy ( model_no, revision, parent_sequence_no, sequence_no, part_no, component_part, lvl, cost, qty, purch_curr ) AS
(
SELECT *
FROM   prod_conf_cost_struct_clv c
MATCH_RECOGNIZE (
  PARTITION BY model_no, revision
  ORDER BY sequence_no desc
  MEASURES (P.sequence_no) AS parent_sequence_no,
           c.sequence_no AS sequence_no, c.part_no as part_no, c.component_part as component_part, c.lvl as lvl, c.cost as cost, c.qty as qty, c.purch_curr as purch_curr
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO NEXT ROW
  -- C => child row
  -- S* => zero or more siblings or children of siblings that might be 
  --           between child and its parent
  -- P? => parent row, which may not exist (e.g., for the root operation)
  PATTERN (C S* P?)
  DEFINE
    C AS 1=1,
    S AS S.lvl >= C.lvl,
    P AS P.lvl = C.lvl - 1 AND P.component_part = C.part_no
)
ORDER BY model_no, revision, sequence_no )
SELECT * FROM corrected_hierarchy;
+----------+----------+--------------------+-------------+---------+----------------+-----+------+-----+------------+
| MODEL_NO | REVISION | PARENT_SEQUENCE_NO | SEQUENCE_NO | PART_NO | COMPONENT_PART | LVL | COST | QTY | PURCH_CURR |
+----------+----------+--------------------+-------------+---------+----------------+-----+------+-----+------------+
|       62 |        1 |                    |           0 | XXX     | ABC            |   1 |      |   1 | GBP        |
|       62 |        1 |                  0 |          10 | ABC     | 123            |   2 |      |   1 | GBP        |
|       62 |        1 |                 10 |          20 | 123     | DEF            |   3 |      |   1 | GBP        |
|       62 |        1 |                 20 |          30 | DEF     | 456            |   4 |  100 |   1 | GBP        |
|       62 |        1 |                 20 |          40 | DEF     | 789            |   4 |   50 |   1 | GBP        |
|       62 |        1 |                 20 |          50 | DEF     | 024            |   4 |   20 |   1 | GBP        |
|       62 |        1 |                  0 |          60 | ABC     | 356            |   2 |      |   1 | GBP        |
|       62 |        1 |                 60 |          70 | 356     | DEF            |   3 |      |   1 | GBP        |
|       62 |        1 |                 70 |          80 | DEF     | 456            |   4 |  100 |   1 | GBP        |
|       62 |        1 |                 70 |          90 | DEF     | 789            |   4 |   50 |   1 | GBP        |
|       62 |        1 |                 70 |         100 | DEF     | 024            |   4 |   20 |   1 | GBP        |
+----------+----------+--------------------+-------------+---------+----------------+-----+------+-----+------------+

Now, you could stop here if you wanted. All you would need to do is use the corrected_hierarchy logic in your calc_cost function, replacing

    and part_no in (
      select component_part
      ...

with

    and parent_sequence_no = sequence_no_

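However, as @Def pointed out, you really do not need a PL/SQL function for what you are trying to do.

What you appear to be trying to do is print a hierarchical bill of materials with the hierarchy cost of each item (where the hierarchy cost is the cost of that item's direct and indirect subcomponents).

Here is a query that does that, putting everything together: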

with prod_conf_cost_struct_clv ( model_no, revision, sequence_no, part_no, component_part, lvl, cost, qty, purch_curr ) as
( 
SELECT 62, 1, 00, 'XXX', 'ABC', 1, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 10, 'ABC', '123', 2, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 20, '123', 'DEF', 3, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 30, 'DEF', '456', 4, 100, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 40, 'DEF', '789', 4, 50, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 50, 'DEF', '024', 4, 20, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 60, 'ABC', '356', 2, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 70, '356', 'DEF', 3, null, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 80, 'DEF', '456', 4, 100, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 90, 'DEF', '789', 4, 50, 1, 'GBP' FROM DUAL UNION ALL
SELECT 62, 1, 100, 'DEF', '024', 4, 20, 1, 'GBP' FROM DUAL )
-- Step 1: correct for your data model problem, which is the fact that child rows
-- (e.g., operations 30-50) are not *explicitly* linked to their parent rows (e.g.,
-- operation 20)
, corrected_hierarchy ( model_no, revision, parent_sequence_no, sequence_no, part_no, component_part, lvl, cost, qty, purch_curr ) AS
(
SELECT *
FROM   prod_conf_cost_struct_clv c
MATCH_RECOGNIZE (
  PARTITION BY model_no, revision
  ORDER BY sequence_no desc
  MEASURES (P.sequence_no) AS parent_sequence_no,
           c.sequence_no AS sequence_no, c.part_no as part_no, c.component_part as component_part, c.lvl as lvl, c.cost as cost, c.qty as qty, c.purch_curr as purch_curr
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO NEXT ROW
  PATTERN (C S* P?)
  DEFINE
    C AS 1=1,
    S AS S.lvl >= C.lvl,
    P AS P.lvl = C.lvl - 1 AND P.component_part = C.part_no
)
ORDER BY model_no, revision, sequence_no ),
sequence_hierarchy_costs as (
SELECT model_no,
       revision,
       min(sequence_no) sequence_no,
       purch_curr,
       sum(h.qty * h.cost) hierarchy_cost
FROM corrected_hierarchy h
WHERE 1=1
connect by model_no = prior model_no
and        revision = prior revision
and        parent_sequence_no = prior sequence_no
group by model_no, revision, connect_by_root sequence_no, purch_curr )
SELECT level,
       sys_connect_by_path(h.sequence_no, '->') path,
       shc.hierarchy_cost
FROM corrected_hierarchy h 
INNER JOIN sequence_hierarchy_costs shc ON shc.model_no = h.model_no and shc.revision = h.revision and shc.sequence_no = h.sequence_no and shc.purch_curr = h.purch_curr
WHERE h.model_no = 62
and   h.revision = 1
START WITH h.sequence_no = 20
connect by h.model_no = prior h.model_no
and        h.revision = prior h.revision
and        h.parent_sequence_no = prior h.sequence_no;
+-------+----------+----------------+
| LEVEL |   PATH   | HIERARCHY_COST |
+-------+----------+----------------+
|     1 | ->20     |            170 |
|     2 | ->20->30 |            100 |
|     2 | ->20->40 |             50 |
|     2 | ->20->50 |             20 |
+-------+----------+----------------+

If you put parent_sequence_no in your data model, you would find this much easier.
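As a rough illustration of how much simpler things get, suppose the parent link were stored with each row. The sketch below uses a made-up inline data set named bom whose parent_sequence_no values are copied from the corrected_hierarchy output above; it is not the real table. With the link in place, a plain CONNECT BY on parent_sequence_no is enough and no pattern matching is needed.

```sql
-- Hypothetical data set: "bom" stands in for a table that already
-- carries parent_sequence_no (values taken from corrected_hierarchy).
WITH bom (sequence_no, parent_sequence_no, cost, qty) AS (
  SELECT   0, CAST (NULL AS NUMBER), CAST (NULL AS NUMBER), 1 FROM DUAL UNION ALL
  SELECT  10,  0, NULL, 1 FROM DUAL UNION ALL
  SELECT  20, 10, NULL, 1 FROM DUAL UNION ALL
  SELECT  30, 20,  100, 1 FROM DUAL UNION ALL
  SELECT  40, 20,   50, 1 FROM DUAL UNION ALL
  SELECT  50, 20,   20, 1 FROM DUAL UNION ALL
  SELECT  60,  0, NULL, 1 FROM DUAL UNION ALL
  SELECT  70, 60, NULL, 1 FROM DUAL UNION ALL
  SELECT  80, 70,  100, 1 FROM DUAL UNION ALL
  SELECT  90, 70,   50, 1 FROM DUAL UNION ALL
  SELECT 100, 70,   20, 1 FROM DUAL
)
-- Sum the priced rows underneath the chosen root.
SELECT CONNECT_BY_ROOT sequence_no AS root_sequence_no,
       SUM (qty * cost)            AS hierarchy_cost
  FROM bom
 WHERE cost IS NOT NULL
 START WITH sequence_no = 20
CONNECT BY parent_sequence_no = PRIOR sequence_no
 GROUP BY CONNECT_BY_ROOT sequence_no;
```

For root 20 on this data the sum covers sequence_no 30, 40 and 50, which is the 170 the question expects. Like the query above, this multiplies each priced row by its own qty only; carrying quantities down from ancestors would need extra work.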

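Answer 1 (score: 1)

Assuming the sequence_no column strictly follows a depth-first traversal of the tree, the missing child/parent relationship can be reconstructed in two ways. First, we can find the parent sequence_no for each child; second, we can find an open interval of sequence_no values that contains a parent's children. Using the data provided in the OP (without the currency column):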

with prod_conf_cost_struct_clv (model_no, revision, sequence_no, part_no, component_part, lvl, cost) as
( 
SELECT 62, 1, 00, 'XXX', 'ABC', 1, null FROM DUAL UNION ALL
SELECT 62, 1, 10, 'ABC', '123', 2, null FROM DUAL UNION ALL
SELECT 62, 1, 20, '123', 'DEF', 3, null FROM DUAL UNION ALL
SELECT 62, 1, 30, 'DEF', '456', 4, 100  FROM DUAL UNION ALL
SELECT 62, 1, 40, 'DEF', '789', 4, 50 FROM DUAL UNION ALL
SELECT 62, 1, 50, 'DEF', '024', 4, 20 FROM DUAL UNION ALL
SELECT 62, 1, 60, 'ABC', '356', 2, null FROM DUAL UNION ALL
SELECT 62, 1, 70, '356', 'DEF', 3, null FROM DUAL UNION ALL
SELECT 62, 1, 80, 'DEF', '456', 4, 100 FROM DUAL UNION ALL
SELECT 62, 1, 90, 'DEF', '789', 4, 50 FROM DUAL UNION ALL
SELECT 62, 1, 100, 'DEF', '024', 4, 20 FROM DUAL )
, hier as(
SELECT  model_no, revision, sequence_no, part_no, component_part, lvl, cost
 , (SELECT nvl(min(b.sequence_no), 2147483647/*max integer*/) 
    FROM prod_conf_cost_struct_clv b 
    WHERE a.lvl <> b.lvl-1
    AND a.sequence_no < b.sequence_no) child_bound_s_n
 , (SELECT max(b.sequence_no) 
    FROM prod_conf_cost_struct_clv b 
    WHERE a.lvl = b.lvl+1
    AND a.sequence_no > b.sequence_no) parent_s_n
FROM prod_conf_cost_struct_clv a
)
SELECT model_no, revision, sequence_no,parent_s_n,child_bound_s_n, part_no, component_part, lvl, cost
FROM hier;

The children of a row, say SEQUENCE_NO = 20, lie in the open interval (SEQUENCE_NO, CHILD_BOUND_S_N), which is (20, 60) here.

MODEL_NO REVISION SEQUENCE_NO   PARENT_S_N  CHILD_BOUND_S_N PART_NO COMPONENT_PART  LVL COST
62       1            0                             20      XXX     ABC             1   
62       1           10          0                  30      ABC     123             2   
62       1           20         10                  60      123     DEF             3   
62       1           30         20                  40      DEF     456             4   100
62       1           40         20                  50      DEF     789             4    50
62       1           50         20                  60      DEF     024             4    20
62       1           60          0                  80      ABC     356             2   
62       1           70         60          2147483647      356     DEF             3   
62       1           80         70                  90      DEF     456             4   100
62       1           90         70                 100      DEF     789             4    50
62       1          100         70          2147483647      DEF     024             4    20
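To make the interval idea concrete, the bounds for SEQUENCE_NO = 20 can be plugged in by hand. This is only an illustrative check against the question's table, with the bounds 20 and 60 read off the output above and hard-coded.

```sql
-- Rows strictly between sequence_no 20 and its child bound 60.
-- The literals are copied from the CHILD_BOUND_S_N output above.
SELECT sequence_no, part_no, component_part
  FROM prod_conf_cost_struct_clv
 WHERE model_no = 62
   AND revision = 1
   AND sequence_no > 20
   AND sequence_no < 60
 ORDER BY sequence_no;
```

On the sample data this returns sequence_no 30, 40 and 50, and leaves out the second set of 'DEF' rows (80, 90, 100) that caused the double counting.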

To minimize changes to the original calc_cost function, the second approach looks more suitable here. So, again without currency data:

CREATE FUNCTION calc_cost(
    model_no_ number, 
    revision_ number, 
    sequence_no_ in number
    --, currency_ in varchar2
  ) return number 
  is
    qty_ number := 0;
    cost_ number := 0;
    lvl_ number := 0;
  begin

    select 1 /*nvl(new_qty, qty)*/, cost, lvl
      into qty_, cost_, lvl_
    from prod_conf_cost_struct_clv
    where model_no = model_no_
      and revision = revision_
      and sequence_no = sequence_no_
      --and (purch_curr = currency_ or purch_curr is null)
      ;

    if cost_ is null then 
      select sum(calc_cost(model_no, revision, sequence_no/*, purch_curr*/)) into cost_ 
      from prod_conf_cost_struct_clv 
      where model_no = model_no_
        and revision = revision_
        --and (purch_curr = currency_ or purch_curr is null)
        and sequence_no > sequence_no_  
        and sequence_no < (SELECT nvl(min(b.sequence_no), 2147483647) 
                      FROM prod_conf_cost_struct_clv b 
                      WHERE lvl_ <> b.lvl-1
                      AND sequence_no_ < b.sequence_no);
    end if;
    return qty_ * cost_;
  exception when no_data_found then 
    return 0;
  end calc_cost;

And applied to the data above:

SELECT calc_cost(62,1,20) FROM DUAL;

CALC_COST(62,1,20)
170

Using the hierarchical query:

with hier as(
 SELECT  model_no, revision, sequence_no, part_no, component_part, lvl, cost
   ,(SELECT nvl(min(b.sequence_no), 2147483647) 
     FROM prod_conf_cost_struct_clv b 
     WHERE a.lvl <> b.lvl-1
     AND a.sequence_no < b.sequence_no) child_bound_s_n
 FROM prod_conf_cost_struct_clv a
)
select level, sys_connect_by_path(sequence_no, '->') path, 
     calc_cost(model_no, revision, sequence_no) total_gbp
from hier
where model_no = 62
  and revision = 1
connect by sequence_no > prior sequence_no 
   and sequence_no < prior child_bound_s_n
  and prior model_no = 62
  and prior revision = 1
start with sequence_no = 20
order by sequence_no;

LEVEL   PATH    TOTAL_GBP
1   ->20        170
2   ->20->30    100
2   ->20->40    50
2   ->20->50    20

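Answer 2 (score: 0)

Do you actually need the function? It seems that what you are really looking for is a calculation over a part and each of its components (and their components, recursively). Try this: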

SELECT sub.root_part, sum(price) AS TOTAL_PRICE
FROM (SELECT CONNECT_BY_ROOT t.part_no AS ROOT_PART, price
      FROM (SELECT DISTINCT model_no, revision, part_no, component_part, price
            FROM prod_conf_cost_struct_clv
            WHERE model_no = 62
            AND revision = 1 )t
      CONNECT BY PRIOR component_part = part_no
      --START WITH part_no = '123'
      ) sub
GROUP BY sub.root_part;

I have commented out the START WITH, but you can put it back if you are only looking for that one ID.