I have two BigQuery tables. Table 1 has the schema {id: String, colors: Array[String]}
and looks like
| id | colors |
|------|-----------------------------|
| id_1 | ["blue", "green", "orange"] |
| id_2 | ["red" , "blue", "green" ] |
| ... | .... |
and table 2 associates colors with numbers. It has the schema {color: String, number: Int}
and looks like
| color | number |
|-------|--------|
| "blue"| 0 |
| "red" | 1 |
| ... | ... |
I want to produce a table that looks like this
| id | numbers |
|----|---------|
|id_1| [0,3,4] |
|id_2| [1,0,3] |
| ...|... |
obtained by mapping each color in table 1 to its corresponding number. The only solution I could come up with is
SELECT id, ARRAY_AGG(number) AS numbers
FROM (table_1 CROSS JOIN UNNEST(table_1.colors) as color) JOIN table_2 USING(color)
GROUP BY id
but this takes an extremely long time (probably because of the cross join).
Answer 0: (score: 0)
You can also express it like this:
SELECT id,
       (SELECT ARRAY_AGG(number) AS numbers
        FROM UNNEST(table_1.colors) color JOIN
             table_2
             USING (color)
       ) AS numbers
FROM table_1;
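I am not sure whether a "local" aggregation per row performs better than the "overall" aggregation in BigQuery, but it is worth a try.
Answer 1: (score: 0)
The following is for BigQuery Standard SQL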
#standardSQL
SELECT id,
ARRAY(
SELECT number FROM table_1.colors color
JOIN `project.dataset.table_2` USING (color)
) AS numbers
FROM `project.dataset.table_1` table_1
You can test this with the sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.table_1` AS (
SELECT 'id_1' id, ["blue", "green", "orange"] colors UNION ALL
SELECT 'id_2', ["red" , "blue", "green" ]
), `project.dataset.table_2` AS (
SELECT 'blue' color, 0 number UNION ALL
SELECT 'red', 1 UNION ALL
SELECT 'green', 3 UNION ALL
SELECT 'orange', 4
)
SELECT id,
ARRAY(
SELECT number FROM table_1.colors color
JOIN `project.dataset.table_2` USING (color)
) AS numbers
FROM `project.dataset.table_1` table_1
This returns one numbers array per id, as requested in the question.
Answer 2: (score: 0)
Something as simple as this:
select id, array_agg(number) as numbers from (
select id, c, t2.number from table_1 t1, unnest(t1.colors) c
join table_2 t2 on c = t2.color
)
group by 1
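One caveat that applies to the queries in this thread: none of them specifies an ordering, so the elements of numbers are not guaranteed to follow the order of colors. If that order matters (as the desired output in the question suggests), a minimal sketch is to carry the array position along with WITH OFFSET and sort by it inside the ARRAY subquery. The alias pos and the short aliases t1 and t2 are illustrative names, not from the original posts, and the inline sample tables stand in for the real ones.

```sql
#standardSQL
-- Sample data copied from the question; replace with the real tables.
WITH table_1 AS (
  SELECT 'id_1' AS id, ['blue', 'green', 'orange'] AS colors UNION ALL
  SELECT 'id_2', ['red', 'blue', 'green']
), table_2 AS (
  SELECT 'blue' AS color, 0 AS number UNION ALL
  SELECT 'red', 1 UNION ALL
  SELECT 'green', 3 UNION ALL
  SELECT 'orange', 4
)
SELECT
  id,
  ARRAY(
    SELECT t2.number
    -- pos (hypothetical alias) is the zero-based index of each color in the array
    FROM UNNEST(t1.colors) AS color WITH OFFSET AS pos
    JOIN table_2 AS t2 USING (color)
    -- sort by the original array position so numbers lines up with colors
    ORDER BY pos
  ) AS numbers
FROM table_1 AS t1
```

Because this is an inner join, a color with no row in table_2 is simply left out of the resulting array.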