Cursor FOR loop creates duplicate entries for nested XQuery

Time: 2013-05-04 09:46:45

Tags: oracle plsql xml-parsing xquery xmltable

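I have created a procedure that parses XML data into multiple tables. I catch the exception raised by the primary key constraint, and if a duplicate is found in the result it is inserted into a table named DUPLICATE.
Now, when I use a cursor, it tends to iterate more times than required. The procedure: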

DECLARE
PER_ID varchar2(20);
    NAME varchar2(20);
SECTIONS_ID varchar2(20);
SECTIONS_NAME varchar2(20);
    var1 number := 0;
    exception_var number;
CURSOR C1 IS
    select d.department_id
       , d.department_name
       , s.sections_id
      , s.sections_name
   from xml_unit_download t
     , xmltable(
         '/ROWSET/DATA'
         passing t.xml_file
         columns
           DEPARTMENT_ID   varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
         , DEPARTMENT_NAME varchar2(30) path 'DEPARTMENT/DEPARTMENT_NAME'
         , SECTIONS        xmltype      path 'SECTIONS'
       ) d
     , xmltable(
         '/SECTIONS'
         passing d.sections
         columns
           SECTIONS_ID     varchar2(20) path 'SECTIONS_ID'
        , SECTIONS_NAME   varchar2(30) path 'SECTIONS_NAME'
      ) s
 where
  t.Status = 4;
  BEGIN

  FOR R_C1 IN C1 LOOP
      BEGIN
      insert into DEPARTMENT(id, name) values(R_C1.department_id, R_C1.department_name);
      insert into SECTIONS(id, name) values(R_C1.SECTIONS_ID, R_C1.SECTIONS_NAME);
      var1:= var1+1;
       dbms_output.put_line('Insert=' || var1);
      commit;
           --dbms_output.put_line('Duplicate='||var);
      EXCEPTION
         WHEN DUP_VAL_ON_INDEX THEN
         dbms_output.put_line('Duplicate=');
         insert into duplicate(id, name) values(R_C1.department_id, R_C1.department_name);
      END;
      END LOOP;
  END;

How should I handle this situation? I have also tried INSERT ALL, but it does not seem to work. Here is my INSERT ALL attempt:

DECLARE
PER_ID varchar2(20);
    NAME varchar2(200);
    var1 number := 0;
    exception_var number;

  BEGIN

      insert all
      into SECTIONS (id) values(department_id)

      --into sect (id, name) values(s.SECTIONS_ID, s.SECTIONS_NAME )
   select d.department_id
       , d.department_name
       , s.sections_id
      , s.sections_name
   from xml_unit_download t
     , xmltable(
         '/ROWSET/DATA'
         passing t.xml_file
         columns
           "DEPARTMENT_ID"   varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
         , "DEPARTMENT_NAME" varchar2(30) path 'DEPARTMENT/DEPARTMENT_NAME'
         , "SECTIONS"        xmltype      path 'SECTIONS'
       ) d
     , xmltable(
         '/SECTIONS'
         passing d.sections
         columns
           "SECTIONS_ID"     varchar2(20) path 'SECTIONS_ID'
        , "SECTIONS_NAME"   varchar2(30) path 'SECTIONS_NAME'
      ) s
 where
  t.Status = 4;
  dbms_output.put_line('Insert=' || var1);
      var1:= var1+1;
       dbms_output.put_line('Insert=' || var1);
      commit;
           --dbms_output.put_line('Duplicate='||var);
      EXCEPTION
         WHEN DUP_VAL_ON_INDEX THEN
         --insert into
         dbms_output.put_line('Duplicate=');
  END;

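The XML being queried contains data for DEPARTMENT and its SECTIONS. DEPARTMENT has a one-to-many relationship with SECTIONS: a DEPARTMENT can have one or more SECTIONS, and there may be cases where a DEPARTMENT has no SECTIONS at all.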

The XML is structured so that each DATA tag identifies a DEPARTMENT and its corresponding set of SECTIONS. The XML:

<ROWSET> 
<DATA>
 <DEPARTMENT>
  <DEPARTMENT_ID>DEP1</DEPARTMENT_ID>
  <DEPARTMENT_NAME>myDEPARTMENT1</DEPARTMENT_NAME>
 </DEPARTMENT>
 <SECTIONS>
  <SECTIONS_ID>6390135666643567</SECTIONS_ID>
  <SECTIONS_NAME>mySection1</SECTIONS_NAME>
  </SECTIONS>
   <SECTIONS>
  <SECTIONS_ID>6390135666643567</SECTIONS_ID>
  <SECTIONS_NAME>mySection2</SECTIONS_NAME>
  </SECTIONS>
 </DATA>
 <DATA>
 <DEPARTMENT>
  <DEPARTMENT_ID>DEP2</DEPARTMENT_ID>
  <DEPARTMENT_NAME>myDEPARTMENT2</DEPARTMENT_NAME>
 </DEPARTMENT>
 <SECTIONS>
  <SECTIONS_ID>63902</SECTIONS_ID>
  <SECTIONS_NAME>mySection1</SECTIONS_NAME>
  </SECTIONS>
 </DATA>
<DATA>
 <DEPARTMENT>
  <DEPARTMENT_ID>DEP3</DEPARTMENT_ID>
  <DEPARTMENT_NAME>myDEPARTMENT3</DEPARTMENT_NAME>
 </DEPARTMENT>
</DATA>
</ROWSET>

2 answers:

Answer 0: (score: 1)

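Since each department can have multiple sections, you can expect duplicate departments. You can get what you want by moving where you catch the exception, so that the sections insert still happens: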

  FOR R_C1 IN C1 LOOP
      BEGIN
         insert into DEPARTMENT(id, name)
            values(R_C1.department_id, R_C1.department_name);
      EXCEPTION
         WHEN DUP_VAL_ON_INDEX THEN
         dbms_output.put_line('Duplicate=');
         insert into duplicate(id, name)
            values(R_C1.department_id, R_C1.department_name);
      END;
      insert into SECTIONS(id, name)
         values(R_C1.SECTIONS_ID, R_C1.SECTIONS_NAME);
      var1:= var1+1;
      dbms_output.put_line('Insert=' || var1);
   END LOOP;

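You could also use a tracker variable (if you see a record with the same department_id as the previous one, don't try to insert the department record, just do the sections insert), or you could use nested loops.

The block below uses one loop to get the department info (which has no duplicates) and the sections as an XMLTYPE, then passes those sections to a second cursor to be expanded.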

declare
   cursor dept_cur is
      select d.department_id
         , d.department_name
         , d.sections
      from xml_unit_download t
         , xmltable(
            '/ROWSET/DATA'
            passing t.xml_file
            columns
              "DEPARTMENT_ID"   varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
            , "DEPARTMENT_NAME" varchar2(30) path 'DEPARTMENT/DEPARTMENT_NAME'
            , "SECTIONS"        xmltype      path 'SECTIONS'
         ) d
      where
         t.Status = 4;

   cursor sect_cur(sections xmltype) is
      select s.sections_id
         , s.sections_name
      from xmltable(
            '/SECTIONS'
            passing sections
            columns
              "SECTIONS_ID"     varchar2(20) path 'SECTIONS_ID'
           , "SECTIONS_NAME"   varchar2(30) path 'SECTIONS_NAME'
         ) s;
begin
   for dept in dept_cur loop
      insert into department(id, name)
         values (dept.department_id, dept.department_name);
      for sect in sect_cur(dept.sections) loop
         insert into sections(id, name, department_id)
            values (sect.sections_id, sect.sections_name, dept.department_id);
      end loop;
   end loop;
end;
/

PL/SQL procedure successfully completed.

select * from department;

ID                             NAME
------------------------------ ------------------------------
DEP1                           myDEPARTMENT1
DEP2                           myDEPARTMENT2
DEP3                           myDEPARTMENT3

select * from sections;

ID                             NAME                           DEPARTMENT_ID
------------------------------ ------------------------------ ------------------------------
6390135666643567               mySection1                     DEP1
6390135666643567               mySection2                     DEP1
63902                          mySection1                     DEP2

You don't have to use PL/SQL, though; you can just do two inserts:

insert into department(id, name)
select d.department_id
   , d.department_name
from xml_unit_download t
   , xmltable(
      '/ROWSET/DATA'
      passing t.xml_file
      columns
        "DEPARTMENT_ID"   varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
      , "DEPARTMENT_NAME" varchar2(30) path 'DEPARTMENT/DEPARTMENT_NAME'
   ) d
where
   t.Status = 4;

...and:

insert into sections(id, name, department_id)
select s.sections_id
   , s.sections_name
   , d.department_id
from xml_unit_download t
   , xmltable(
      '/ROWSET/DATA'
      passing t.xml_file
      columns
        "DEPARTMENT_ID"   varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
      , "DEPARTMENT_NAME" varchar2(30) path 'DEPARTMENT/DEPARTMENT_NAME'
      , "SECTIONS"        xmltype      path 'SECTIONS'
   ) d
   , xmltable(
      '/SECTIONS'
      passing d.sections
      columns
        "SECTIONS_ID"     varchar2(20) path 'SECTIONS_ID'
      , "SECTIONS_NAME"   varchar2(30) path 'SECTIONS_NAME'
   ) s
where
   t.Status = 4;

...which puts the same data in the tables as the PL/SQL block.

In both I've assumed you want a column linking the two tables, but they might not have a unique link and you might need a separate department_section table, which is easy enough to populate in the same way.

Also note that both approaches create a department record for a department with no sections (DEP3 in the sample), which your original would not do unless you used the outer join from an earlier answer; you would then have to notice that there is no section information and not attempt the second insert.
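The tracker-variable idea mentioned earlier in this answer, combined with the outer join, might look roughly like the sketch below. It is not tested against your schema. It assumes that sections has a department_id column (as in the examples above), that your Oracle version accepts ANSI joins against XMLTABLE with a correlated PASSING clause, and that ordering by department_id is acceptable; the variable name last_department_id is made up for illustration.

```sql
declare
   -- Hypothetical tracker: department_id of the previous row (NULL before the first row).
   last_department_id varchar2(20);

   cursor c1 is
      select d.department_id
         , d.department_name
         , s.sections_id
         , s.sections_name
      from xml_unit_download t
         cross join xmltable(
            '/ROWSET/DATA'
            passing t.xml_file
            columns
              department_id   varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
            , department_name varchar2(30) path 'DEPARTMENT/DEPARTMENT_NAME'
            , sections        xmltype      path 'SECTIONS'
         ) d
         -- Outer join so a department without sections still yields one row.
         left outer join xmltable(
            '/SECTIONS'
            passing d.sections
            columns
              sections_id     varchar2(20) path 'SECTIONS_ID'
            , sections_name   varchar2(30) path 'SECTIONS_NAME'
         ) s on 1 = 1
      where t.status = 4
      -- Keep rows for the same department next to each other for the tracker.
      order by d.department_id;
begin
   for r in c1 loop
      -- Insert the department only when the id changes from the previous row.
      if last_department_id is null
         or r.department_id != last_department_id then
         insert into department(id, name)
            values (r.department_id, r.department_name);
         last_department_id := r.department_id;
      end if;

      -- Rows produced only by the outer join carry no section data.
      if r.sections_id is not null then
         insert into sections(id, name, department_id)
            values (r.sections_id, r.sections_name, r.department_id);
      end if;
   end loop;
end;
/
```

A department that genuinely appears more than once in the input would still raise DUP_VAL_ON_INDEX here, so the department insert can be wrapped in the same exception handler as in the first loop of this answer if you want to log those rows.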

Answer 1: (score: 1)

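This looks like a basic cardinality problem: if a department has more than one section, the result contains multiple rows for that department, so you see duplicate department information because of the procedural code rather than because of the input data.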
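To see the cardinality for yourself, a query along these lines counts the joined rows per department. It is only a sketch: the sample XML is inlined (trimmed to the ID elements) instead of being read from xml_unit_download, and the names src and joined_rows are made up for illustration.

```sql
-- Standalone check; the XML literal stands in for xml_unit_download.xml_file.
with src as (
   select xmltype(
      '<ROWSET>
         <DATA>
            <DEPARTMENT><DEPARTMENT_ID>DEP1</DEPARTMENT_ID></DEPARTMENT>
            <SECTIONS><SECTIONS_ID>6390135666643567</SECTIONS_ID></SECTIONS>
            <SECTIONS><SECTIONS_ID>6390135666643567</SECTIONS_ID></SECTIONS>
         </DATA>
         <DATA>
            <DEPARTMENT><DEPARTMENT_ID>DEP2</DEPARTMENT_ID></DEPARTMENT>
            <SECTIONS><SECTIONS_ID>63902</SECTIONS_ID></SECTIONS>
         </DATA>
         <DATA>
            <DEPARTMENT><DEPARTMENT_ID>DEP3</DEPARTMENT_ID></DEPARTMENT>
         </DATA>
       </ROWSET>'
   ) as xml_file
   from dual
)
select d.department_id
   , count(*) as joined_rows   -- one row per (department, section) pair
from src t
   , xmltable(
      '/ROWSET/DATA'
      passing t.xml_file
      columns
        department_id varchar2(20) path 'DEPARTMENT/DEPARTMENT_ID'
      , sections      xmltype      path 'SECTIONS'
   ) d
   , xmltable(
      '/SECTIONS'
      passing d.sections
      columns
        sections_id   varchar2(20) path 'SECTIONS_ID'
   ) s
group by d.department_id
order by d.department_id;
```

With the sample data this should report two joined rows for DEP1 and one for DEP2, and no row at all for DEP3, because the inner join drops departments that have no sections. Every department with more than one joined row is a department insert that the original loop attempts more than once.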

Rather than trying to do this in one query, why not break it down into two cursors / FOR loops?

Something like this:

BEGIN

  <<department_loop>>
  FOR r_department IN (
    SELECT 
      d.department_id
    , d.department_name
    , d.sections
    FROM xml_unit_download t
   , XMLTABLE(
     '/ROWSET/DATA'
     PASSING t.xml_file
     COLUMNS
       department_id   VARCHAR2(20) PATH 'DEPARTMENT/DEPARTMENT_ID'
     , department_name VARCHAR2(30) PATH 'DEPARTMENT/DEPARTMENT_NAME'
     , sections        XMLTYPE      path 'SECTIONS'
    ) d
    WHERE t.status = 4
  )
  LOOP

    BEGIN
      INSERT INTO departments (id, name)
      VALUES (r_department.department_id, r_department.department_name);
    EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
        INSERT INTO department_duplicates (id, name)
        VALUES (r_department.department_id, r_department.department_name);
    END;

    <<section_loop>>
    FOR r_section IN (
      SELECT 
        s.sections_id
      , s.sections_name
      FROM XMLTABLE (
      '/SECTIONS'
      PASSING r_department.sections
      COLUMNS
        sections_id   VARCHAR2(20) PATH 'SECTIONS_ID'
      , sections_name VARCHAR2(30) PATH 'SECTIONS_NAME'
      ) s
    )
    LOOP

      BEGIN
        INSERT INTO sections (id, name, department_id)
        VALUES (r_section.sections_id, r_section.sections_name, r_department.department_id);
      EXCEPTION
        WHEN DUP_VAL_ON_INDEX THEN
          INSERT INTO section_duplicates (id, name, department_id)
          VALUES (r_section.sections_id, r_section.sections_name, r_department.department_id);
      END;

    END LOOP section_loop;

  END LOOP department_loop;

END;
/

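This has the following benefits:

  • You can fairly easily catch individual duplicates (if any are found) for departments and for sections separately.
  • You are not introducing cardinality through the procedural code; any duplicates are genuine duplicates in the input data.
  • You don't have to worry about tracking which rows you are using or have already used; that is implicit in the nested loop structure.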