Question

I'm struggling with a SQL statement I have to write. Here is my context:

I have two tables to migrate:

TABLE_A
╔═══════╦══════╗
║  Id   ║Value ║
╠═══════╬══════╣
║ 1     ║    a ║
║ 2     ║    a ║
║ 3     ║    a ║
║ 4     ║    b ║
║ 5     ║    b ║
║ 6     ║    b ║
╚═══════╩══════╝

TABLE_B
╔════╦═══════╦══════╗
║ Id ║  Id_A ║Value ║
╠════╬═══════╬══════╣
║  1 ║ 1     ║    x ║
║  2 ║ 2     ║    x ║
║  3 ║ 3     ║    x ║
║  4 ║ 4     ║    x ║
║  5 ║ 5     ║    x ║
║  6 ║ 6     ║    x ║
╚════╩═══════╩══════╝

I want to get this result:

TABLE_A
╔═══════╦══════╗
║  Id   ║Value ║
╠═══════╬══════╣
║ 1     ║    a ║
║ 4     ║    b ║
╚═══════╩══════╝

TABLE_B
╔════╦═══════╦══════╗
║ Id ║  Id_A ║Value ║
╠════╬═══════╬══════╣
║  1 ║ 1     ║    x ║
║  2 ║ 1     ║    x ║
║  3 ║ 1     ║    x ║
║  4 ║ 4     ║    x ║
║  5 ║ 4     ║    x ║
║  6 ║ 4     ║    x ║
╚════╩═══════╩══════╝

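Edit: the logic here is to remove the duplicate values from TABLE_A. The problem is that once rows are deleted from TABLE_A, the related ids (Id_A) in TABLE_B no longer point at an existing row. That is why TABLE_B is expected to end up with the data shown above.

For TABLE_A, I think this statement would do it: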

DELETE FROM TABLE_A WHERE ID NOT IN (SELECT distinct ID_A FROM TABLE_B)

But for TABLE_B, I don't know how to do it...

Any ideas? Thanks a lot!

Jean

Answer 1

First, don't delete any rows from table_a until you have updated table_b...

I'm not great at updates in Oracle, so someone may give a cleaner answer :)

UPDATE
  table_b
SET
  id_a = (SELECT MIN(tgt.id)
            FROM table_a   src
      INNER JOIN table_a   tgt ON src.value = tgt.value
           WHERE src.id = table_b.id_a
         )

Then you can delete all the records in table_a that are "duplicates" (keeping the row with the lowest id).
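The two steps above can be tried end to end with the sketch below. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for Oracle, and the DELETE statement is one possible way to write the "keep the lowest id" step, not something taken from this answer.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, id_a INTEGER, value TEXT);
    INSERT INTO table_a VALUES (1,'a'),(2,'a'),(3,'a'),(4,'b'),(5,'b'),(6,'b');
    INSERT INTO table_b VALUES (1,1,'x'),(2,2,'x'),(3,3,'x'),
                               (4,4,'x'),(5,5,'x'),(6,6,'x');
""")

# Step 1: repoint each table_b row at the lowest table_a id that shares
# the same value as the row it currently references.
conn.execute("""
    UPDATE table_b
    SET id_a = (SELECT MIN(tgt.id)
                FROM table_a src
                JOIN table_a tgt ON src.value = tgt.value
                WHERE src.id = table_b.id_a)
""")

# Step 2 (assumed formulation): delete every table_a row that is not the
# lowest id for its value.
conn.execute("""
    DELETE FROM table_a
    WHERE id NOT IN (SELECT MIN(id) FROM table_a GROUP BY value)
""")
conn.commit()

rows_a = conn.execute("SELECT id, value FROM table_a ORDER BY id").fetchall()
rows_b = conn.execute("SELECT id, id_a FROM table_b ORDER BY id").fetchall()
print(rows_a)
print(rows_b)
```

The order matters: running step 2 first would leave the subquery in step 1 with no source row to look up for ids 2, 3, 5 and 6, and those id_a values would become NULL.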

Answer 2

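Obviously you have to take care of table_b first. In general, when you must update one table based on data in another table, the merge statement is easier to use (and more flexible) than update. By comparison, deleting the required rows from table_a is easier.

Setup:

(Note that the values in table_b differ from yours; I made them distinct so I could test that the merge statement is correct.)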

create table table_a ( id, value ) as
  select 1, 'a' from dual union all
  select 2, 'a' from dual union all
  select 3, 'a' from dual union all
  select 4, 'b' from dual union all
  select 5, 'b' from dual union all
  select 6, 'b' from dual
;

create table table_b ( id, id_a, value ) as 
  select 1, 1, 'x1' from dual union all
  select 2, 2, 'x2' from dual union all
  select 3, 3, 'x3' from dual union all
  select 4, 4, 'x4' from dual union all
  select 5, 5, 'x5' from dual union all
  select 6, 6, 'x6' from dual
;

First, update the rows in table_b:

merge into table_b t
  using (
          select b.id, x.min_id_a
          from   table_b b inner join table_a a on b.id_a = a.id
                           inner join ( 
                                        select   min(id) as min_id_a, value
                                        from     table_a
                                        group by value
                                      ) x
                                                on a.value = x.value
        ) s
    on ( t.id = s.id )
when matched then update set t.id_a = s.min_id_a
;

Verify:

ID  ID_A  VALUE
--  ----  -----
 1     1  x1
 2     1  x2
 3     1  x3
 4     4  x4
 5     4  x5
 6     4  x6

Delete from table_a:

delete from
  ( select a.id, x.min_id
    from   table_a a inner join
           ( select   min(id) as min_id, value
             from     table_a
             group by value
           ) x
                on a.value = x.value
  ) t
where id != min_id
;

Verify:

ID  VALUE
--  -----
 1  a
 4  b

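Be careful: Oracle is a multi-user environment. The merge statement (which updates table_b) reads data from table_a, and there is a danger that between the time of that read and the time of the later delete statement (which removes the duplicates from table_a), another user or process modifies table_a, so that the end result of modifying the two tables is incorrect. You must prevent that from happening. How you do so depends on things you haven't shared with us; just keep in mind that it is something you have to consider.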
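One way to narrow that window is to run both statements inside a single transaction that takes its write lock before reading anything. The sketch below shows the shape of that idea using Python's sqlite3 module as a stand-in, which is an assumption: `BEGIN IMMEDIATE` is SQLite syntax, `dedupe_in_one_transaction` is a hypothetical helper name, and the statements are simplified equivalents of the merge and delete above. In Oracle, something like `LOCK TABLE table_a IN EXCLUSIVE MODE` before the merge could play a similar role, but whether that is acceptable depends on your workload.

```python
import sqlite3

def dedupe_in_one_transaction(conn):
    """Hypothetical helper: repoint table_b, then delete duplicates, atomically."""
    # SQLite-specific: acquire the write lock up front so no other writer
    # can change table_a between the read and the delete.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("""
            UPDATE table_b
            SET id_a = (SELECT MIN(tgt.id)
                        FROM table_a src
                        JOIN table_a tgt ON src.value = tgt.value
                        WHERE src.id = table_b.id_a)
        """)
        conn.execute("""
            DELETE FROM table_a
            WHERE id NOT IN (SELECT MIN(id) FROM table_a GROUP BY value)
        """)
        conn.execute("COMMIT")
    except Exception:
        # Undo both statements if either one fails.
        conn.execute("ROLLBACK")
        raise

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, id_a INTEGER, value TEXT);
    INSERT INTO table_a VALUES (1,'a'),(2,'a'),(3,'a'),(4,'b'),(5,'b'),(6,'b');
    INSERT INTO table_b VALUES (1,1,'x1'),(2,2,'x2'),(3,3,'x3'),
                               (4,4,'x4'),(5,5,'x5'),(6,6,'x6');
""")
dedupe_in_one_transaction(conn)

# Sanity check: no table_b row should reference a deleted table_a row.
orphans = conn.execute("""
    SELECT COUNT(*) FROM table_b
    WHERE id_a NOT IN (SELECT id FROM table_a)
""").fetchone()[0]
remaining = [r[0] for r in conn.execute("SELECT id FROM table_a ORDER BY id")]
print(orphans, remaining)
```

The orphan count at the end is a cheap check worth running after the real migration as well, since it detects exactly the dangling-reference problem described in the question.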

Answer 3

This should be done in a single transaction.

First update table_b using dense_rank over table_a (i.e., the rows where dense_rank() = 1).

update table_b
set Id_A = (
    select anew.Id
    from table_a a
    join ( select dra.Id, dra.Value
           from ( select a2.Id, a2.Value,
                         dense_rank() over (partition by a2.Value order by a2.Id) dr
                  from table_a a2 ) dra   -- no AS here: Oracle rejects AS before a table alias
           where dra.dr = 1 ) anew
    on a.Value = anew.Value
    where table_b.Id_A = a.Id )

Then delete what you don't want from table_a (i.e., the rows where dense_rank() <> 1).

delete from table_a
where Id in (
    select sq.Id
    from ( select a.Id, a.Value,
                  dense_rank() over (partition by a.Value order by a.Id) dr
           from table_a a ) sq
    where sq.dr <> 1 )
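To see which rows each of the two statements above targets, it can help to look at the ranks on their own. The sketch below is an illustration under assumptions: it runs the same dense_rank() expression in SQLite (via Python's sqlite3 module, which needs a bundled SQLite of version 3.25 or later for window functions) against the sample data from the question.

```python
import sqlite3

# In-memory SQLite database with the question's TABLE_A data (stand-in for Oracle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO table_a VALUES (1,'a'),(2,'a'),(3,'a'),(4,'b'),(5,'b'),(6,'b');
""")

# Rank the rows inside each group of equal values, lowest id first.
ranked = conn.execute("""
    SELECT id, value,
           DENSE_RANK() OVER (PARTITION BY value ORDER BY id) AS dr
    FROM table_a
    ORDER BY id
""").fetchall()

# Rank 1 is the surviving row that table_b gets repointed to;
# every other rank is a duplicate that the delete removes.
keep = [row_id for row_id, _, dr in ranked if dr == 1]
drop = [row_id for row_id, _, dr in ranked if dr != 1]
print(keep, drop)
```

Because the ids are unique within each partition, dense_rank(), rank() and row_number() would all give the same numbering here.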

How do I replace references to duplicate rows in another table with the first one?
