I am trying to join two tables, each of which has an array column, like this:
SELECT a.id, b.value
FROM a INNER JOIN b
ON a.array IN b.array
or
SELECT a.id, b.value
FROM a INNER JOIN b
ON UNNEST(a.array) IN UNNEST(b.array)
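According to this SO question, Postgres has operators such as <@ and @> that test whether one array is a subset of another (Postgres doc page), but BigQuery only allows comparing a single array element against another array, like this: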
a.arrayelement IN UNNEST(b.array)
Can this be done in BigQuery?
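For reference, the element-level test mentioned above can be sketched as follows. This is a minimal illustration with made-up literal arrays (not the real tables): each element is checked individually against the other array with IN UNNEST().

```sql
#standardSQL
-- Hypothetical literals: test each element of [2, 5] for membership in [1, 2, 3, 4].
SELECT
  elem,
  elem IN UNNEST([1, 2, 3, 4]) AS is_member
FROM UNNEST([2, 5]) AS elem
```

Here 2 is reported as a member and 5 is not. A subset test has to combine these per-element results in some way.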
Edit
This is the schema I am working with:
WITH b AS (
  { "ip": "192.168.1.1",
    "cookie": [
      { "key": "apple",
        "value": "red"
      },
      { "key": "peach",
        "value": "pink"
      },
      { "key": "orange",
        "value": "orange"
      }
    ]
  }
  ,{ "ip": "192.168.1.2",
    "cookie": [
      { "key": "apple",
        "value": "red"
      },
      { "key": "orange",
        "value": "orange"
      }
    ]
  }
),
a AS (
  { "id": "12345",
    "cookie": [
      { "key": "peach",
        "value": "pink"
      }
    ]
  }
  ,{ "id": "67890",
    "cookie": [
      { "key": "apple",
        "value": "red"
      },
      { "key": "orange",
        "value": "orange"
      }
    ]
  }
)
I expect output like the following:
ip, id
192.168.1.1, 67890
192.168.1.2, 67890
192.168.1.1, 12345
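This is a follow-up to the SO question "How do I find elements in an array in BigQuery". I tried using a subquery to compare the individual elements of one of the arrays, but BigQuery returns an error saying I have "too many subqueries".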
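Answer 0: (score: 6)
Here is an alternative solution that avoids running a JOIN inside a correlated subquery and relies on an IN UNNEST() expression instead, which should perform better: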
#standardSQL
WITH a AS (
SELECT 1 AS id, [2,4] AS a_arr UNION ALL
SELECT 2, [3,5]
),
b AS (
SELECT 11 AS value, [1,2,3,4] AS b_arr UNION ALL
SELECT 12, [1,3,5,6]
)
SELECT a.id, b.value
FROM a , b
WHERE (SELECT LOGICAL_AND(a_i IN UNNEST(b.b_arr)) FROM UNNEST(a.a_arr) a_i)
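The same pattern can be carried over to the cookie schema from the question. The following is only a sketch under some assumptions: the JSON rows are rewritten as arrays of STRUCT(key, value), and each struct is flattened to a 'key=value' string so that plain IN UNNEST() applies. The helper CTEs a_pairs and b_pairs and the '=' separator are made up for illustration, and the separator would need to change if keys or values can contain '='.

```sql
#standardSQL
WITH b AS (
  SELECT '192.168.1.1' AS ip,
         [STRUCT('apple' AS key, 'red' AS value),
          STRUCT('peach' AS key, 'pink' AS value),
          STRUCT('orange' AS key, 'orange' AS value)] AS cookie
  UNION ALL
  SELECT '192.168.1.2',
         [STRUCT('apple' AS key, 'red' AS value),
          STRUCT('orange' AS key, 'orange' AS value)]
),
a AS (
  SELECT '12345' AS id,
         [STRUCT('peach' AS key, 'pink' AS value)] AS cookie
  UNION ALL
  SELECT '67890',
         [STRUCT('apple' AS key, 'red' AS value),
          STRUCT('orange' AS key, 'orange' AS value)]
),
-- Hypothetical helper CTEs: one 'key=value' string per cookie entry.
a_pairs AS (
  SELECT id,
         ARRAY(SELECT CONCAT(c.key, '=', c.value) FROM UNNEST(cookie) AS c) AS pairs
  FROM a
),
b_pairs AS (
  SELECT ip,
         ARRAY(SELECT CONCAT(c.key, '=', c.value) FROM UNNEST(cookie) AS c) AS pairs
  FROM b
)
SELECT b_pairs.ip, a_pairs.id
FROM a_pairs, b_pairs
-- Keep a pair of rows only if every entry of a's cookie also appears in b's cookie.
WHERE (SELECT LOGICAL_AND(p IN UNNEST(b_pairs.pairs)) FROM UNNEST(a_pairs.pairs) AS p)
```

With this sample data the subset test holds for id 67890 against both IPs, and for id 12345 only against 192.168.1.1, the one row whose cookie contains peach.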
Answer 1: (score: 4)
In BigQuery Standard SQL you can mimic this pseudocode from the question:
SELECT a.id, b.value
FROM a INNER JOIN b
ON a.array IN b.array
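One way to express the pseudocode `a.array IN b.array` is to join the two unnested arrays inside a correlated subquery and check that every distinct element of a's array found a match. The query below is a sketch of that idea, not necessarily the code originally posted with this answer. The sample data and the column names a_arr and b_arr are borrowed from the other answer for illustration.

```sql
#standardSQL
WITH a AS (
  SELECT 1 AS id, [2, 4] AS a_arr UNION ALL
  SELECT 2, [3, 5]
),
b AS (
  SELECT 11 AS value, [1, 2, 3, 4] AS b_arr UNION ALL
  SELECT 12, [1, 3, 5, 6]
)
SELECT a.id, b.value
FROM a CROSS JOIN b
WHERE (
  -- Number of distinct elements of a_arr that also occur in b_arr ...
  SELECT COUNT(DISTINCT a_i)
  FROM UNNEST(a.a_arr) AS a_i
  JOIN UNNEST(b.b_arr) AS b_i ON a_i = b_i
) = (
  -- ... must equal the number of distinct elements of a_arr.
  SELECT COUNT(DISTINCT a_i) FROM UNNEST(a.a_arr) AS a_i
)
```

For this sample data the query pairs id 1 with value 11 and id 2 with value 12. COUNT(DISTINCT ...) is used on both sides so that duplicate elements in either array do not skew the comparison.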
If you would like me to apply this to your example, let me know, or you can try it yourself first :o)