Join tables in BigQuery, updating only new strings

Date: 2018-02-02 18:03:03

Tags: google-bigquery

I have the following two tables:

╔═════════╦══════════╦════════╦══════════╗
║ Keyword ║ Category ║ Amount ║  Update  ║
╠═════════╬══════════╬════════╬══════════╣
║ dog     ║ Animal   ║      2 ║ 1/1/2018 ║
║ fish    ║ Animal   ║      4 ║ 1/1/2018 ║
║ cat     ║ Animal   ║      5 ║ 1/1/2018 ║
║ bird    ║ Animal   ║      7 ║ 1/1/2018 ║
║ bike    ║ Other    ║      1 ║ 1/1/2018 ║
║ rabbit  ║ Animal   ║     11 ║ 1/1/2018 ║
╚═════════╩══════════╩════════╩══════════╝


╔═════════╦══════════╦════════╦══════════╗
║ Keyword ║ Category ║ Amount ║  Update  ║
╠═════════╬══════════╬════════╬══════════╣
║ lion    ║ Animal   ║      2 ║ 1/2/2018 ║
║ snake   ║ Animal   ║      9 ║ 1/2/2018 ║
║ cat     ║ Animal   ║     18 ║ 1/2/2018 ║
║ bird    ║ Animal   ║     13 ║ 1/2/2018 ║
║ bike    ║ Other    ║      1 ║ 1/2/2018 ║
║ bottle  ║ Other    ║     11 ║ 1/2/2018 ║
╚═════════╩══════════╩════════╩══════════╝

Which SQL query (in BigQuery) would produce the following table?

╔═════════╦══════════╦════════╦══════════╗
║ Keyword ║ Category ║ Amount ║  Update  ║
╠═════════╬══════════╬════════╬══════════╣
║ dog     ║ Animal   ║      2 ║ 1/1/2018 ║
║ fish    ║ Animal   ║      4 ║ 1/1/2018 ║
║ cat     ║ Animal   ║     18 ║ 1/2/2018 ║
║ bird    ║ Animal   ║     13 ║ 1/2/2018 ║
║ rabbit  ║ Animal   ║     11 ║ 1/1/2018 ║
║ lion    ║ Animal   ║      2 ║ 1/2/2018 ║
║ snake   ║ Animal   ║      9 ║ 1/2/2018 ║
║ bike    ║ Other    ║      1 ║ 1/2/2018 ║
║ bottle  ║ Other    ║     11 ║ 1/2/2018 ║
╚═════════╩══════════╩════════╩══════════╝

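Requirements:
- If a keyword is not yet in the first table, add it as a new row.
- If a keyword is already in the first table, update only its amount and date.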

3 Answers:

Answer 0 (score: 0)

Here is one approach:

select t2.*
from t2
union all
select t1.*
from t1
where not exists (select 1 from t2 where t2.keyword = t1.keyword);

This returns every row from the second table, plus the rows from the first table whose keyword has no match in the second.
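To check that query against the sample data in the question, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for BigQuery, which is not what the answer ran; the UNION ALL / NOT EXISTS pattern is the same, but the quoted "update" column, the text dates, and the helper name merge_tables are illustrative choices only.

```python
import sqlite3

# Sample rows copied from the question. SQLite stands in for BigQuery here
# (an assumption); dates are kept as plain text exactly as shown.
T1_ROWS = [
    ("dog", "Animal", 2, "1/1/2018"),
    ("fish", "Animal", 4, "1/1/2018"),
    ("cat", "Animal", 5, "1/1/2018"),
    ("bird", "Animal", 7, "1/1/2018"),
    ("bike", "Other", 1, "1/1/2018"),
    ("rabbit", "Animal", 11, "1/1/2018"),
]
T2_ROWS = [
    ("lion", "Animal", 2, "1/2/2018"),
    ("snake", "Animal", 9, "1/2/2018"),
    ("cat", "Animal", 18, "1/2/2018"),
    ("bird", "Animal", 13, "1/2/2018"),
    ("bike", "Other", 1, "1/2/2018"),
    ("bottle", "Other", 11, "1/2/2018"),
]

# All of t2, plus the t1 rows whose keyword does not appear in t2.
QUERY = """
SELECT t2.* FROM t2
UNION ALL
SELECT t1.* FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.keyword = t1.keyword)
"""


def merge_tables():
    """Load both sample tables and return the merged rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    for name, rows in (("t1", T1_ROWS), ("t2", T2_ROWS)):
        # "update" is quoted because UPDATE is an SQL keyword.
        conn.execute(
            f'CREATE TABLE {name} '
            '(keyword TEXT, category TEXT, amount INTEGER, "update" TEXT)'
        )
        conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in sorted(merge_tables()):
        print(row)
```

On the sample data this yields nine rows: the six rows of the second table plus dog, fish, and rabbit from the first. The row order of a UNION ALL is not guaranteed, so add an ORDER BY if the order matters.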

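Answer 1 (score: 0)

Table C looks like table B plus the records from table A that do not exist in table B.

If you need a separate table C, create tablec as select * from tableb and then insert into tablec. If you only want the data, you can insert directly into tableb: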

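-- Backticks around Update are defensive: the name doubles as a DML keyword.
INSERT INTO tableb (keyword, category, amount, `Update`)
SELECT keyword, category, amount, `Update`
FROM tablea
WHERE NOT EXISTS (
  SELECT 1 FROM tableb WHERE tableb.keyword = tablea.keyword
);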
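The two-step variant mentioned above (copy tableb into a new tablec, then insert the missing rows from tablea) can be sketched as follows. This again assumes an in-memory SQLite database rather than BigQuery, and build_tablec is a hypothetical helper name; treat it as an illustration of the steps, not as BigQuery-specific DDL.

```python
import sqlite3


def build_tablec(tablea_rows, tableb_rows):
    """Return tablec = tableb plus the tablea rows whose keyword is not in tableb.

    Hypothetical helper; SQLite is used as a stand-in for BigQuery.
    """
    conn = sqlite3.connect(":memory:")
    ddl = 'CREATE TABLE {} (keyword TEXT, category TEXT, amount INTEGER, "update" TEXT)'
    conn.execute(ddl.format("tablea"))
    conn.execute(ddl.format("tableb"))
    conn.executemany("INSERT INTO tablea VALUES (?, ?, ?, ?)", tablea_rows)
    conn.executemany("INSERT INTO tableb VALUES (?, ?, ?, ?)", tableb_rows)

    # Step 1: start tablec as a copy of tableb.
    conn.execute("CREATE TABLE tablec AS SELECT * FROM tableb")

    # Step 2: add the tablea rows whose keyword does not appear in tableb.
    conn.execute(
        """
        INSERT INTO tablec
        SELECT keyword, category, amount, "update"
        FROM tablea
        WHERE NOT EXISTS (
            SELECT 1 FROM tableb WHERE tableb.keyword = tablea.keyword
        )
        """
    )
    rows = conn.execute("SELECT * FROM tablec ORDER BY keyword").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    table_a = [("dog", "Animal", 2, "1/1/2018"), ("cat", "Animal", 5, "1/1/2018")]
    table_b = [("cat", "Animal", 18, "1/2/2018"), ("lion", "Animal", 2, "1/2/2018")]
    for row in build_tablec(table_a, table_b):
        print(row)
```

Building a separate tablec leaves both source tables untouched, whereas inserting directly into tableb changes it in place.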

Answer 2 (score: 0)

This is how I got it working:

SELECT * FROM dataset.t2
UNION DISTINCT
SELECT * FROM dataset.t1
WHERE t1.Keyword NOT IN (SELECT Keyword FROM `project.dataset.t2`)