Question

I have two simple tables:

source

id    count  date
6       30  10-28
7       80  10-29
5       20  10-28
4       10  10-27

destination

id    count  date   
7       10  10-29
5       90  10-28
6       10  10-28

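What I want is to merge the contents of source into destination: where the id and date match, compare the two count values and keep the larger one. The query should also insert a row from source into destination when destination has no row with that id + date yet.

After running the query, destination should look like this: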

id    count  date   
7       80  10-29
5       90  10-28
6       30  10-28
4       10  10-27

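This is the query I have come up with so far, but it does not actually update the destination table, and I cannot use MERGE. I am also not sure how efficient it is: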

select id, max(count), date from (
   select id, max(count) as count, date from source group by id, count, date
   union
   select id, max(count) as count, date from destination group by id, count, date
)
group by id, date;

I am running the query on Amazon Redshift.

Thanks!

Answer 1

greatest can be used together with a left join:

select s.id, greatest(s.count,d.count) as count, 
       s.date
  from source s
  left join destination d 
    on ( s.id = d.id and s.date = d.date );

P.S. If a value in the list passed to greatest (or to least, for the minimum case) is NULL, it is ignored.
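This NULL handling is what makes the left join work for source rows that have no match in destination: d.count comes back as NULL and greatest falls back to s.count. A minimal sketch with literal values, assuming the PostgreSQL-style behaviour described in the P.S.:

```sql
-- a NULL argument is skipped, so the non-NULL value is returned
select greatest(10, null) as kept_value;

-- the result is NULL only when every argument is NULL
select greatest(null::int, null::int) as all_null;
```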

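If you want to change the destination table rather than just select from it, and you have no merge statement, you can use a CTAS (create table as) statement, as in the following code block: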

create table destination2 as 
select s.id, greatest(s.count,d.count) as count, s.date
  from source s
  left join destination d 
    on ( s.id = d.id and s.date = d.date );

delete from destination;

insert into destination
select * from destination2;

drop table destination2;

select * from destination;
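One point worth checking against your own data: the left join starts from source, so a row that exists only in destination never reaches destination2 and is gone after the delete. The sample data does not show this, because every destination row also appears in source. A sketch of one way to carry such rows over, assuming they should be kept:

```sql
-- run after "create table destination2 ..." and before "delete from destination"
-- copies rows that exist only in destination into destination2
insert into destination2 (id, count, date)
select d.id, d.count, d.date
  from destination d
 where not exists
       ( select 1
           from source s
          where s.id   = d.id
            and s.date = d.date );
```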

Answer 2

Of course, MERGE is just a (more efficient) replacement for the well-known two-step process:

-- first update existing id/date combinations
update destination
set count = source.count
from source
where destination.id    = source.id
  and destination.date  = source.date
  and destination.count < source.count;

-- then insert new id/date combinations
insert into destination
select id, count, date
from source
where not exists
 ( select * from destination
   where destination.id    = source.id
     and destination.date  = source.date
 );
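To try the two-step process end to end, here is a sketch that loads the question's sample data into scratch tables and runs both steps inside one transaction, so other sessions do not see the state between the update and the insert. The names source_demo and destination_demo are hypothetical, and storing date as a short string is an assumption based on how the sample values are written:

```sql
-- hypothetical scratch tables holding the question's sample data
create temp table source_demo      (id int, count int, date varchar(5));
create temp table destination_demo (id int, count int, date varchar(5));

insert into source_demo values
  (6, 30, '10-28'), (7, 80, '10-29'), (5, 20, '10-28'), (4, 10, '10-27');
insert into destination_demo values
  (7, 10, '10-29'), (5, 90, '10-28'), (6, 10, '10-28');

begin;

-- step 1: raise count where the source value is larger
update destination_demo
   set count = source_demo.count
  from source_demo
 where destination_demo.id    = source_demo.id
   and destination_demo.date  = source_demo.date
   and destination_demo.count < source_demo.count;

-- step 2: add id/date combinations missing from the destination
insert into destination_demo (id, count, date)
select id, count, date
  from source_demo
 where not exists
       ( select 1
           from destination_demo
          where destination_demo.id   = source_demo.id
            and destination_demo.date = source_demo.date );

commit;

-- should match the result table in the question
select id, count, date from destination_demo order by id;
```

The update assumes id + date is unique in the source table; with duplicates, which source row supplies the new count is not well defined.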

Answer 3

You can use union all or a full join to generate the table:

select id, date, max(count) as count
from ((select id, date, count from source
      ) union all
      (select id, date, count from destination
      )
     ) t
group by id, date;
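The answer shows only the union all form. A sketch of what the full join alternative might look like, assuming id + date is unique within each table (otherwise the join multiplies rows, which the union all version with group by avoids):

```sql
-- coalesce picks the key from whichever side has the row;
-- greatest skips the NULL count coming from the missing side
select coalesce(s.id, d.id)       as id,
       coalesce(s.date, d.date)   as date,
       greatest(s.count, d.count) as count
  from source s
  full join destination d
    on s.id   = d.id
   and s.date = d.date;
```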

If you want to save these results in a table, I would tend to suggest creating a new table and replacing the old one:

create table new_destination as
    select id, date, max(count) as count
    from ((select id, date, count from source
          ) union all
          (select id, date, count from destination
          )
         ) t
    group by id, date;

truncate table destination;

insert into destination (id, date, count)
    select id, date, count
    from new_destination;
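The statements above refill the existing table rather than swapping it out. A sketch of a swap by renaming, offered as an alternative and not as the answer author's method: destination_old is a hypothetical name, and this assumes no views reference destination and that nothing relies on its constraints or column defaults, which create table as does not copy. Note also that new_destination has its columns in the order id, date, count.

```sql
-- destination_old is a hypothetical name for the retired table
alter table destination rename to destination_old;
alter table new_destination rename to destination;
drop table destination_old;
```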

How do I merge two tables together, selecting the column with the higher value, without being able to use the MERGE statement?
