Question

Say I have a BigQuery table that looks like this:

select 'bob' as pk, cast('2019-02-28 00:00:00' as datetime) as updated, 13 as val
union all select 'bob' as pk, cast('2019-03-01 00:00:00' as datetime) as updated, 15 as val
union all select 'joe' as pk, cast('2019-03-05 00:00:00' as datetime) as updated, 7 as val
union all select 'joe' as pk, cast('2019-03-07 00:00:00' as datetime) as updated, 9 as val
union all select 'tim' as pk, cast('2019-03-02 00:00:00' as datetime) as updated, 15 as val

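I need to keep one row for each unique pk and delete the rest. When a pk is duplicated, the row with the most recent updated timestamp is kept and the others are deleted. So in the example above, rows 1 and 3 would be deleted, because for both of those pks there is another row with a more recent updated value.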

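I am using the Python BigQuery client, and I would like to use the DELETE FROM syntax as shown here. It is more important to (a) delete the rows, and less important to (b) return the filtered table with a select statement.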
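To make the requirement concrete, here is a minimal, untested sketch of one possible way to express it. It assumes the table has no NULL pk or updated values, and that rows tied on the newest updated value may all be kept. The function names (`dedup_delete_sql`, `delete_older_duplicates`, `rows_to_delete`) and the table name are hypothetical; the `client` argument is assumed to be a `google.cloud.bigquery.Client`. The `STRUCT ... NOT IN` form is just one option, not necessarily the syntax the linked page shows.

```python
from datetime import datetime

# The sample rows from the question, as (pk, updated, val) tuples.
SAMPLE = [
    ("bob", datetime(2019, 2, 28), 13),
    ("bob", datetime(2019, 3, 1), 15),
    ("joe", datetime(2019, 3, 5), 7),
    ("joe", datetime(2019, 3, 7), 9),
    ("tim", datetime(2019, 3, 2), 15),
]


def dedup_delete_sql(table: str, pk: str = "pk", ts: str = "updated") -> str:
    """Build a DELETE that removes every row that is not the newest for its key.

    `table` is a placeholder such as "my-project.my_dataset.my_table".
    Rows tied on the newest timestamp are all kept (assumption).
    """
    return (
        f"DELETE FROM `{table}`\n"
        f"WHERE STRUCT({pk}, {ts}) NOT IN (\n"
        f"  SELECT AS STRUCT {pk}, MAX({ts}) AS {ts}\n"
        f"  FROM `{table}`\n"
        f"  GROUP BY {pk}\n"
        f")"
    )


def delete_older_duplicates(client, table: str) -> None:
    """Run the DELETE; `client` is assumed to be a google.cloud.bigquery.Client."""
    client.query(dedup_delete_sql(table)).result()  # block until the DML job ends


def rows_to_delete(rows):
    """Apply the same rule locally, to check which rows the DELETE should hit."""
    newest = {}
    for pk, updated, _ in rows:
        if pk not in newest or updated > newest[pk]:
            newest[pk] = updated
    # A row is deleted when its timestamp is not the newest one for its pk.
    return [row for row in rows if row[1] != newest[row[0]]]
```

With a real client this would be called as `delete_older_duplicates(bigquery.Client(), "my-project.my_dataset.my_table")`. The local `rows_to_delete(SAMPLE)` check selects the first bob row and the first joe row, which matches rows 1 and 3 in the example.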

BigQuery: delete duplicate primary keys from a table, ordered by datetime

0 answers: