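I currently have a main results table (test1) that stores all of my issue records, and another table (test2) that is refreshed roughly once a week. I want to find the records that no longer appear in the weekly run and set their date in the main results table, because that is the date the issue was corrected in the system.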
I am trying to add the records from test2 to test1 if they are not already there.
This works:
insert into table test1 (id, name, code)
select * from test2 t2 where t2.id not in (select id from test1);
I am also trying to update the Corrected_date column of test1 so that it shows the current date for every record that is found in test1 but not in test2.
Sample data:
Table 1
ID  NAME   CODE  CORRECTED_DATE
1   TEST   3
29  TEST2  90
Table 2
ID  NAME   CODE
12  TEST5  20
1   TEST   3
Expected final result for Table 1
ID  NAME   CODE  CORRECTED_DATE
1   TEST   3
29  TEST2  90    3/13/2019
12  TEST5  20
Answer 0 (score: 0)
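Overwrite the table using a FULL JOIN. A FULL JOIN returns the joined records, plus the unjoined records from the left table, plus the unjoined records from the right table. You can implement your logic with case statements, like this: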
insert OVERWRITE table test1
select
--select t1 if both or t1 only exist, t2 if only t2 exists
case when t1.ID is null then t2.ID else t1.ID end as ID,
case when t1.ID is null then t2.NAME else t1.NAME end as NAME,
case when t1.ID is null then t2.CODE else t1.CODE end as CODE,
--if found in t1 but not in t2 then current_date else leave as is
case when (t1.ID is not null) and (t2.ID is null) then current_date else t1.CORRECTED_DATE end as CORRECTED_DATE
from test1 t1
FULL OUTER JOIN test2 t2 on t1.ID=t2.ID;
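To make the three cases of the join easier to see, here is a small illustrative query that only labels each joined row by where it was found. It is a sketch, not part of the update itself: the found_in column name is made up for this example, and it assumes ID is unique and never null in both tables.

```sql
-- Illustrative only: label each row of the FULL OUTER JOIN by its origin.
-- found_in is a hypothetical column name used just for this example.
select
  coalesce(t1.ID, t2.ID) as ID,
  case
    when t2.ID is null then 'test1 only'  -- this is the case that gets current_date
    when t1.ID is null then 'test2 only'  -- new record, values are taken from test2
    else 'both'                           -- already known, left as is
  end as found_in
from test1 t1
FULL OUTER JOIN test2 t2 on t1.ID=t2.ID;
```

With the sample data from the question, ID 1 is labeled 'both', ID 12 'test2 only' and ID 29 'test1 only' (row order is not guaranteed).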
See also this similar question about incremental updates. Your logic is different, but the approach is the same: https://stackoverflow.com/a/37744071/2700344
Test data:
with test1 as (
select stack (2,
1, 'TEST', 3,null,
29,'TEST2', 90 , null
) as (ID,NAME,CODE,CORRECTED_DATE)
),
test2 as (
select stack (2,
12,'TEST5',20,
1,'TEST',3
) as (ID, NAME, CODE)
)
select
--select t1 if both or t1 only exist, t2 if only t2 exists
case when t1.ID is null then t2.ID else t1.ID end as ID,
case when t1.ID is null then t2.NAME else t1.NAME end as NAME,
case when t1.ID is null then t2.CODE else t1.CODE end as CODE,
--if found in test1 but not in test2 then current_date else leave as is
case when (t1.ID is not null) and (t2.ID is null) then current_date else t1.CORRECTED_DATE end as CORRECTED_DATE
from test1 t1
FULL OUTER JOIN test2 t2 on t1.ID=t2.ID;
Result:
OK
id name code corrected_date
1 TEST 3 NULL
12 TEST5 20 NULL
29 TEST2 90 2019-03-14
Time taken: 41.727 seconds, Fetched: 3 row(s)
The result is as expected.
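One thing to check before running this every week: as written, the query sets CORRECTED_DATE to current_date on every run for any row that is in test1 but missing from test2, so a date set in an earlier week would be replaced by the newer one. If you want to keep the first date instead (this is an assumption about your requirement, and it also assumes CORRECTED_DATE is a DATE column), a minimal sketch of the variant could look like this:

```sql
-- Sketch under the assumptions above: same FULL OUTER JOIN,
-- but an already populated CORRECTED_DATE is kept.
insert OVERWRITE table test1
select
  case when t1.ID is null then t2.ID   else t1.ID   end as ID,
  case when t1.ID is null then t2.NAME else t1.NAME end as NAME,
  case when t1.ID is null then t2.CODE else t1.CODE end as CODE,
  -- found in test1 but not in test2: keep the existing date, else use today
  case when (t1.ID is not null) and (t2.ID is null)
       then coalesce(t1.CORRECTED_DATE, current_date)
       else t1.CORRECTED_DATE
  end as CORRECTED_DATE
from test1 t1
FULL OUTER JOIN test2 t2 on t1.ID=t2.ID;
```

The only change from the query above is the coalesce around the date, so rows that already carry a corrected date pass through unchanged.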