BigQuery UPDATE / MERGE on a table with multiple matching rows

Time: 2019-12-17 13:46:05

Tags: google-bigquery bigdata inner-join jointable

I have the following:

Table A:

|uid|info|..
|123|null|..

Table B:

|uid|goodinfo|timestamp|
|123|3|2019-12-12|
|123|5|2019-01-12|
|234|11|2019-10-12|

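Whenever I try to run an UPDATE statement, I get the error "UPDATE/MERGE must match at most one source row for each target row", because Table B returns multiple rows for the same uid and I have no way of making the join any more specific than this.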

I tried:

UPDATE `Table A` a
SET info = (select goodinfo from `Table B` where uid=123
ORDER BY lastmodifieddate DESC
LIMIT 1) b
WHERE 
a.info IS NULL AND
a.user_id=123

- This works, but because I cannot reference Table A inside the subquery, I cannot generalize it to the following:

SET info = (select goodinfo from `Table B` where uid=a.uid
ORDER BY lastmodifieddate DESC
LIMIT 1) b

- This throws an error saying that "a.uid" is not recognized.

Then I tried BigQuery's MERGE:

MERGE `Table A` a 
USING (
  select goodinfo,uid from `Table B` 
  ORDER BY lastmodifieddate DESC
  LIMIT 1
) b
ON a.uid = b.uid 
WHEN MATCHED and a.info is null and DATE(a.timestamp) = "2019-12-12" THEN
  UPDATE SET a.info = b.goodinfo

- This query actually completes successfully, but it does not modify any rows, for a reason I have not found yet.

Then I tried:

UPDATE `Table A` a 
SET a.info = b.goodinfo
FROM `Table B` b
WHERE a.uid = b.uid
and DATE(a.timestamp) = "2019-12-12"
and a.info IS NULL
- Here I get the same error, and I have no way to filter the data from Table B.

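Does anyone have an idea how to update the data in a generic way, filtering the data in Table B so that the join picks up only the value "3" from goodinfo?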

I was also thinking of doing something like:

WITH filtered_table_b AS (
  select uid, goodinfo from `Table B`
  ORDER BY lastmodifieddate DESC
  LIMIT 1
)

But that does not help, because I need to pick the latest goodinfo for each user based on the timestamp, not a single row overall.

Thanks

1 Answer:

Answer 0: (score: 2)

Here is a standard SQL query you can use:

WITH data AS (
select '123' as uid, 3 as goodinfo, DATE('2019-12-12') as timestamp union all
select '123' as uid, 5 as goodinfo, DATE('2019-01-12') as timestamp union all
select '234' as uid, 11 as goodinfo, DATE('2019-10-12') as timestamp 
),
filterData AS (
select uid, max(timestamp) maxTimestamp from data
group by uid
)

select data.uid, goodinfo, filterData.maxTimestamp as  maxTimestamp 
from data inner join filterData on data.uid = filterData.uid and data.timestamp = filterData.maxTimestamp

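Here is the output of the query above (one row per uid, holding its most recent goodinfo):

|uid|goodinfo|maxTimestamp|
|123|3|2019-12-12|
|234|11|2019-10-12|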
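
To carry the same idea through to the update itself, the deduplicated rows can be used as the source of a MERGE. The following is only a sketch under a few assumptions: `mydataset.table_a` and `mydataset.table_b` are hypothetical names standing in for Table A and Table B, the ordering column is taken to be `timestamp` as in the Table B listing (the question's own queries call it `lastmodifieddate`), and `info` is assumed to have the same type as `goodinfo`. It uses ROW_NUMBER() rather than the max(timestamp) join, because the join can still return two rows for a uid when two rows share the same latest timestamp, which would bring back the "at most one source row" error.

```sql
-- Sketch only: `mydataset.table_a` and `mydataset.table_b` are hypothetical
-- names standing in for Table A and Table B from the question.
MERGE `mydataset.table_a` a
USING (
  SELECT uid, goodinfo
  FROM (
    SELECT
      uid,
      goodinfo,
      -- Rank each uid's rows from newest to oldest; rn = 1 is the latest row.
      ROW_NUMBER() OVER (PARTITION BY uid ORDER BY timestamp DESC) AS rn
    FROM `mydataset.table_b`
  )
  WHERE rn = 1
) b
ON a.uid = b.uid
-- Only fill in target rows that have no info yet.
WHEN MATCHED AND a.info IS NULL THEN
  UPDATE SET info = b.goodinfo
```

Because the source now has exactly one row per uid, each target row matches at most one source row. If several rows tie on the latest timestamp, ROW_NUMBER() keeps an arbitrary one of them, so add a tiebreaker to the ORDER BY if that matters. The same deduplicated subquery should also work in the FROM clause of an UPDATE ... FROM statement.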