I have a table like this:
header: source, user, metric1, metric2, ...
data:
source1, user1, metrics..
source1, user2, metrics..
source2, user1, metrics..
source3, user1, metrics...
source3, user3, metrics...
...
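For each source, I want to aggregate some metrics over the users that do not appear in that source. In the example above, for source2 I want to pick out user2 and user3 and get the avg of their metrics.
It looks something like: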
select avg(metric1) from tbl as tbl1
where user not in
(select user from tbl as tbl2 where tbl1.source=tbl2.source)
group by source
According to the documentation page, the query above does not work in Legacy SQL:
(https://cloud.google.com/bigquery/docs/reference/legacy-sql)
But in ANSI SQL I get a "Resources exceeded" error.
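Answer 0 (score: 1)
Using WHERE IN / NOT IN is generally not considered good practice when the inner SELECT returns a large number of records. Try rewriting your query with a LEFT OUTER JOIN, for example: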
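select s.source as source, avg(t.metric1) as avg_metric1
from (select distinct source from tbl) as s  -- every source
cross join tbl as t                          -- paired with every row
-- anti-join: keep rows whose user has no row in source s
left outer join tbl as tbl2 on tbl2.source = s.source and tbl2.user = t.user
where tbl2.user is null
group by s.source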
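
To sanity-check the LEFT OUTER JOIN ... IS NULL (anti-join) idea on the sample rows from the question, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. This is not BigQuery, and the metric1 values are made up for illustration; only the (source, user) pairs come from the question. The helper name avg_for_missing_users is hypothetical. The query pairs every distinct source with every row, then keeps only the pairs where that user has no row in that source. A self-join on source alone would match every row with itself, so the join also has to compare user.

```python
import sqlite3

# Hypothetical metric1 values; only the (source, user) pairs come from the question.
ROWS = [
    ("source1", "user1", 10.0),
    ("source1", "user2", 20.0),
    ("source2", "user1", 30.0),
    ("source3", "user1", 40.0),
    ("source3", "user3", 60.0),
]

# Pair each distinct source with every row, then anti-join on (source, user):
# rows survive only if their user never appears in the paired source.
QUERY = """
SELECT s.source AS source, AVG(t.metric1) AS avg_metric1
FROM (SELECT DISTINCT source FROM tbl) AS s
CROSS JOIN tbl AS t
LEFT OUTER JOIN tbl AS t2
  ON t2.source = s.source AND t2.user = t.user
WHERE t2.user IS NULL
GROUP BY s.source
ORDER BY s.source
"""


def avg_for_missing_users(rows):
    """Return (source, avg metric1 of users absent from that source) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (source TEXT, user TEXT, metric1 REAL)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for source, avg_metric1 in avg_for_missing_users(ROWS):
        print(source, avg_metric1)
```

With these made-up values, source2 averages the rows of user2 and user3, which matches the example in the question. The CROSS JOIN can still be expensive on a large table, so whether this avoids "Resources exceeded" in BigQuery is something you would need to test on your own data.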