Question

I have this sample table:

PK_HASH | PERSON_ID | VALID_FROM | VALID_TO  | CREATION_DATE | NAME  | SURNAME |
------------------------------------------------------------------------------
456a8ed1| 000001    | 01.01.2016 | 31.01.2016| 03.01.2016    | John  | Smith   |
a48e4b22| 000001    | 01.01.2016 | 31.01.2016| 04.01.2016    | James | Smith   |
788fee89| 000001    | 01.01.2016 | 31.01.2016| 05.01.2016    | James | null    |
42cba184| 000001    | 01.01.2016 | 31.01.2016| 12.01.2016    | null  | null    |
5bcc48ad| 000002    | 01.01.2016 | 31.01.2016| 03.01.2016    | Mike  | Legend  |
e48da448| 000003    | 01.01.2016 | 31.01.2016| 03.01.2016    | Karl  | Rogel   |
889775ea| 000003    | 01.01.2016 | 31.01.2016| 05.01.2016    | Carl  | null    |

Is it possible to write a merge SQL command for Oracle whose result would be:

PK_HASH | PERSON_ID | VALID_FROM | VALID_TO  | CREATION_DATE | NAME  | SURNAME |
------------------------------------------------------------------------------
456a8ed1| 000001    | 01.01.2016 | 31.01.2016| 03.01.2016    | James | Smith   |
5bcc48ad| 000002    | 01.01.2016 | 31.01.2016| 03.01.2016    | Mike  | Legend  |
e48da448| 000003    | 01.01.2016 | 31.01.2016| 03.01.2016    | Carl  | Rogel   |

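Meaning:

- The data should be grouped by the column pair (PERSON_ID, VALID_FROM).
- Only one unique row, the one with the lowest CREATION_DATE, should remain for each (PERSON_ID, VALID_FROM).
- If NAME or SURNAME has changed, the values should be changed/merged into this one row (e.g. 000001: John -> James, or 000003: Karl -> Carl).
- If there is a 'null' value, it changes nothing and has to be ignored.
- So the NAME or SURNAME with the highest CREATION_DATE should be merged into the single row with the lowest CREATION_DATE (except for 'null' values).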

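I tried to write a MERGE command, but I'm not sure whether it can actually be used for this.

I need to change the contents of the table; I don't need a result set. So some updates and deletes are required.

Please don't try to make sense of the data in the table. It is fictional and only serves to describe the problem.

Thank you very much for any help.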

Answer 1

This is easily achieved using analytic functions:

    with sample_data as (select '456a8ed1' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('03/01/2016', 'dd/mm/yyyy') creation_date, 'John' name, 'Smith' surname from dual union all
                         select 'a48e4b22' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('04/01/2016', 'dd/mm/yyyy') creation_date, 'James' name, 'Smith' surname from dual union all
                         select '788fee89' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('05/01/2016', 'dd/mm/yyyy') creation_date, 'James' name, null surname from dual union all
                         select '42cba184' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('12/01/2016', 'dd/mm/yyyy') creation_date, null name, null surname from dual union all
                         select '5bcc48ad' pk_hash, 2 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('03/01/2016', 'dd/mm/yyyy') creation_date, 'Mike' name, 'Legend' surname from dual union all
                         select 'e48da448' pk_hash, 3 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('03/01/2016', 'dd/mm/yyyy') creation_date, 'Karl' name, 'Rogel' surname from dual union all
                         select '889775ea' pk_hash, 3 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('05/01/2016', 'dd/mm/yyyy') creation_date, 'Carl' name, null surname from dual)
    -- end of mimicking a table called "sample_data" containing your data. You wouldn't need this, since you have your table.
    -- See SQL below:
    select pk_hash,
           person_id,
           valid_from,
           valid_to,
           creation_date,
           latest_name name,
           latest_surname surname
    from   (select pk_hash,
                   person_id,
                   valid_from,
                   valid_to,
                   creation_date,
                   row_number() over (partition by person_id, valid_from order by creation_date) rn,
                   last_value(name ignore nulls) over (partition by person_id, valid_from order by creation_date
                                                       rows between unbounded preceding and unbounded following) latest_name,
                   last_value(surname ignore nulls) over (partition by person_id, valid_from order by creation_date
                                                          rows between unbounded preceding and unbounded following) latest_surname
            from   sample_data)
    where  rn = 1;

    PK_HASH   PERSON_ID VALID_FROM VALID_TO   CREATION_DATE NAME  SURNAME
    -------- ---------- ---------- ---------- ------------- ----- -------
    456a8ed1          1 01.01.2016 31.01.2016 03.01.2016    James Smith  
    5bcc48ad          2 01.01.2016 31.01.2016 03.01.2016    Mike  Legend 
    e48da448          3 01.01.2016 31.01.2016 03.01.2016    Carl  Rogel  

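The `row_number()` function numbers the rows within each group (here, each combination of person_id and valid_from), ordered by creation_date, so the earliest row in each group gets rn = 1.

The `last_value()` function with `ignore nulls` finds the last non-null value of the given column across all rows in the group, ordered by creation_date. The `rows between unbounded preceding and unbounded following` clause makes the window span the whole group; without it, the default window stops at the current row.

Then you just filter the results to keep the row with rn = 1 in each group.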
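To see why the explicit window clause matters, it can help to look at the intermediate values before filtering. The sketch below reuses the two rows for person 3 from the question; the column alias `name_default_frame` is made up for this illustration and shows what `last_value()` returns when the window clause is left out.

```sql
-- Two rows for person 3, taken from the question's sample data.
with sample_data as (
  select 'e48da448' pk_hash, 3 person_id, date '2016-01-01' valid_from,
         date '2016-01-03' creation_date, 'Karl' name, 'Rogel' surname from dual union all
  select '889775ea', 3, date '2016-01-01',
         date '2016-01-05', 'Carl', null from dual
)
select pk_hash,
       creation_date,
       row_number() over (partition by person_id, valid_from order by creation_date) rn,
       -- no window clause: the default window ends at the current row
       last_value(name ignore nulls) over (partition by person_id, valid_from order by creation_date) name_default_frame,
       -- explicit window: every row sees the whole group
       last_value(name ignore nulls) over (partition by person_id, valid_from order by creation_date
                                           rows between unbounded preceding and unbounded following) latest_name,
       last_value(surname ignore nulls) over (partition by person_id, valid_from order by creation_date
                                              rows between unbounded preceding and unbounded following) latest_surname
from   sample_data
order  by creation_date;
```

For the row with rn = 1 (e48da448), `name_default_frame` should still be Karl, because the later row is outside the default window, while `latest_name` should be Carl and `latest_surname` Rogel. Since rn = 1 is the row that is kept, the full window is what carries the newest non-null values into it.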

Here is a test case showing how the query above can be used as part of a merge statement to do the update/delete:

Create the table with data:

    create table sample_data as
    select '456a8ed1' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('03/01/2016', 'dd/mm/yyyy') creation_date, 'John' name, 'Smith' surname from dual union all
    select 'a48e4b22' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('04/01/2016', 'dd/mm/yyyy') creation_date, 'James' name, 'Smith' surname from dual union all
    select '788fee89' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('05/01/2016', 'dd/mm/yyyy') creation_date, 'James' name, null surname from dual union all
    select '42cba184' pk_hash, 1 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('12/01/2016', 'dd/mm/yyyy') creation_date, null name, null surname from dual union all
    select '5bcc48ad' pk_hash, 2 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('03/01/2016', 'dd/mm/yyyy') creation_date, 'Mike' name, 'Legend' surname from dual union all
    select 'e48da448' pk_hash, 3 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('03/01/2016', 'dd/mm/yyyy') creation_date, 'Karl' name, 'Rogel' surname from dual union all
    select '889775ea' pk_hash, 3 person_id, to_date('01/01/2016', 'dd/mm/yyyy') valid_from, to_date('31/01/2016', 'dd/mm/yyyy') valid_to, to_date('05/01/2016', 'dd/mm/yyyy') creation_date, 'Carl' name, null surname from dual;
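Before changing anything, it may be worth previewing which rows would be removed. The query below is only a sketch of such a check, not part of the original answer; it assumes the `sample_data` table created above and lists every row that is not the earliest one in its (person_id, valid_from) group.

```sql
-- Preview: rows that are not the earliest in their group and would be deleted.
select pk_hash,
       person_id,
       valid_from,
       creation_date
from   (select pk_hash,
               person_id,
               valid_from,
               creation_date,
               row_number() over (partition by person_id, valid_from order by creation_date) rn
        from   sample_data)
where  rn != 1
order  by person_id, creation_date;
```

With the sample data this should list four rows: a48e4b22, 788fee89 and 42cba184 for person 1, and 889775ea for person 3.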

Run the merge statement to update and delete:

    merge into sample_data tgt
    using (select pk_hash,
                  person_id,
                  valid_from,
                  valid_to,
                  creation_date,
                  row_number() over (partition by person_id, valid_from order by creation_date) rn,
                  last_value(name ignore nulls) over (partition by person_id, valid_from order by creation_date
                                                      rows between unbounded preceding and unbounded following) latest_name,
                  last_value(surname ignore nulls) over (partition by person_id, valid_from order by creation_date
                                                         rows between unbounded preceding and unbounded following) latest_surname
           from   sample_data) src
      on (tgt.pk_hash = src.pk_hash)
    when matched then
      update set tgt.name = src.latest_name,
                 tgt.surname = src.latest_surname
      -- need to update all the rows, in order to delete the ones we're not interested in, otherwise they
      -- won't be seen by the delete statement since we're basing the delete on the src.rn column:
      delete where src.rn != 1;

    commit;

Output:

    select * from sample_data;

    PK_HASH   PERSON_ID VALID_FROM VALID_TO   CREATION_DATE NAME  SURNAME
    -------- ---------- ---------- ---------- ------------- ----- -------
    456a8ed1          1 01.01.2016 31.01.2016 03.01.2016    James Smith
    5bcc48ad          2 01.01.2016 31.01.2016 03.01.2016    Mike  Legend
    e48da448          3 01.01.2016 31.01.2016 03.01.2016    Carl  Rogel
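
On a larger table, eyeballing the output is not practical. A possible sanity check, offered here as an assumption rather than as part of the original answer, is to look for any (person_id, valid_from) group in `sample_data` that still has more than one row after the merge.

```sql
-- Sanity check after the merge: any group listed here still has duplicates.
select person_id,
       valid_from,
       count(*) row_count
from   sample_data
group  by person_id, valid_from
having count(*) > 1;
```

If the merge worked as intended, this query should return no rows.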

SQL Oracle - merging several rows into one row in the same table (update one row, delete the others)
