Say I have a BigQuery table that looks like this:
select 'bob' as pk, cast('2019-02-28 00:00:00' as datetime) as updated, 13 as val
union all select 'bob' as pk, cast('2019-03-01 00:00:00' as datetime) as updated, 15 as val
union all select 'joe' as pk, cast('2019-03-05 00:00:00' as datetime) as updated, 7 as val
union all select 'joe' as pk, cast('2019-03-07 00:00:00' as datetime) as updated, 9 as val
union all select 'tim' as pk, cast('2019-03-02 00:00:00' as datetime) as updated, 15 as val
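I need to keep exactly one row for each unique pk and delete the rest. If a pk is duplicated, the row with the most recent updated timestamp is kept and the others are deleted. In the example above, rows 1 and 3 would be deleted, because other rows with the same pk have more recent updated values.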
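To make the rule concrete, here is a minimal plain-Python sketch of which rows should survive. It does not touch BigQuery at all. The helper name split_latest is made up for illustration, and the rows are the five sample rows from the query above.

```python
from datetime import datetime

# The five sample rows from the query above: (pk, updated, val).
rows = [
    ("bob", datetime(2019, 2, 28), 13),
    ("bob", datetime(2019, 3, 1), 15),
    ("joe", datetime(2019, 3, 5), 7),
    ("joe", datetime(2019, 3, 7), 9),
    ("tim", datetime(2019, 3, 2), 15),
]


def split_latest(rows):
    """Split rows into (kept, deleted) by the latest `updated` per pk.

    Hypothetical helper, only meant to illustrate the rule.
    """
    latest = {}
    for pk, updated, _val in rows:
        if pk not in latest or updated > latest[pk]:
            latest[pk] = updated
    kept = [r for r in rows if r[1] == latest[r[0]]]
    deleted = [r for r in rows if r[1] != latest[r[0]]]
    return kept, deleted


kept, deleted = split_latest(rows)
for pk, updated, val in deleted:
    print("delete:", pk, updated.isoformat(), val)
for pk, updated, val in kept:
    print("keep:  ", pk, updated.isoformat(), val)
```

Note that if two rows share both the same pk and the same updated value, this rule alone keeps both of them. The sample data has no such tie.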
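I'm using the Python BigQuery client, and I would like to use the DELETE FROM syntax as shown here. It is more important to (a) actually delete the rows, and less important to (b) return the filtered table with a SELECT statement.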
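For reference, this is roughly the shape I have in mind: a sketch only, not verified against a live project. It assumes the google-cloud-bigquery package and default credentials. The table name my-project.my_dataset.my_table and both function names are placeholders. The statement is one possible formulation using a correlated EXISTS subquery: it deletes every row for which a newer row with the same pk exists.

```python
def build_dedup_delete(table: str) -> str:
    """Build a DELETE statement that keeps only the newest row per pk.

    `table` is a fully qualified name such as
    "my-project.my_dataset.my_table" (a placeholder, not a real table).
    """
    return (
        f"DELETE FROM `{table}` t\n"
        "WHERE EXISTS (\n"
        "  SELECT 1\n"
        f"  FROM `{table}` s\n"
        "  WHERE s.pk = t.pk\n"
        "    AND s.updated > t.updated\n"
        ")"
    )


def run_dedup_delete(table: str, client=None):
    """Run the DELETE and return the number of rows it removed.

    Assumes google-cloud-bigquery is installed; imported lazily so the
    statement can be built and inspected without it.
    """
    from google.cloud import bigquery

    client = client or bigquery.Client()
    job = client.query(build_dedup_delete(table))
    job.result()  # wait for the DML job to finish
    return job.num_dml_affected_rows


if __name__ == "__main__":
    # Print the statement only; nothing is executed against BigQuery here.
    print(build_dedup_delete("my-project.my_dataset.my_table"))
```

Calling run_dedup_delete("my-project.my_dataset.my_table") would cover (a). For (b), a plain SELECT on the same table afterwards should return the filtered rows.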