I want to compare the arrays in two columns and return, in a third column, the number of strings that the two arrays have in common.
|---------------------|------------------|------------------|
| column 1            | column 2         | column 3         |
|---------------------|------------------|------------------|
| [cat, dog, bird]    | [cat, bird]      | 2                |
|---------------------|------------------|------------------|
| [cat, bear, tiger]  | [tiger]          | 1                |
|---------------------|------------------|------------------|
| [cat, tiger]        | [tiger, cat]     | 2                |
|---------------------|------------------|------------------|
Answer 0 (score: 1)
You can use unnest(). Assuming the individual arrays contain no duplicates:
with t as (
      select array['cat', 'dog', 'bird'] as column1, array['cat', 'bird'] as column2 union all
      select array['cat', 'bear', 'tiger'], array['tiger'] union all
      select array['cat', 'tiger'], array['tiger', 'cat']
     )
select t.*,
       (select count(*)
        from unnest(column1) el1 join
             unnest(column2) el2
             on el1 = el2
       ) as column3
from t;
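The no-duplicates assumption matters here: if a value is repeated in either array, the join produces one row per matching pair, so count(*) counts that value more than once. A minimal sketch of one way to relax the assumption is to count distinct matched values instead. The sample row with a repeated 'cat' is made up for illustration and is not part of the question's data.

```sql
with t as (
      -- hypothetical row: 'cat' appears twice in column1
      select array['cat', 'cat', 'dog'] as column1, array['cat', 'bird'] as column2
     )
select t.*,
       -- count(*) would return 2 for this row (two 'cat' = 'cat' pairs);
       -- count(distinct el1) counts each shared value once
       (select count(distinct el1)
        from unnest(column1) el1 join
             unnest(column2) el2
             on el1 = el2
       ) as column3
from t;
```

With this change the row above yields 1, since 'cat' is the only value present in both arrays.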
Answer 1 (score: 0)
Assuming the arrays contain no duplicates, here is another option:
#standardSQL
SELECT *,
  ARRAY_LENGTH(ARRAY(
    SELECT item
    FROM UNNEST(column1 || column2) AS item
    GROUP BY item
    HAVING COUNT(1) > 1
  )) AS column3
FROM `project.dataset.table`
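What the query above does: it concatenates the two arrays, groups the combined elements by value, discards every value that occurs only once, and finally counts the values that remain, each of which is an element shared by both arrays.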
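To make the intermediate step visible, the sketch below applies the same logic to inline copies of the question's sample rows and also returns the array of shared items before it is counted. The CTE name t and the column alias shared_items are illustrative and not part of the original query. The order of elements inside shared_items is not guaranteed.

```sql
#standardSQL
WITH t AS (
  -- inline sample rows taken from the question
  SELECT ['cat', 'dog', 'bird'] AS column1, ['cat', 'bird'] AS column2 UNION ALL
  SELECT ['cat', 'bear', 'tiger'], ['tiger'] UNION ALL
  SELECT ['cat', 'tiger'], ['tiger', 'cat']
)
SELECT *,
  -- values that occur more than once in the concatenated array
  ARRAY(
    SELECT item
    FROM UNNEST(column1 || column2) AS item
    GROUP BY item
    HAVING COUNT(1) > 1
  ) AS shared_items,
  -- the same subquery, reduced to its length
  ARRAY_LENGTH(ARRAY(
    SELECT item
    FROM UNNEST(column1 || column2) AS item
    GROUP BY item
    HAVING COUNT(1) > 1
  )) AS column3
FROM t
```

As with the original query, this only works if neither array contains duplicates, because a value repeated within a single array would also pass the COUNT(1) > 1 filter.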
Also, I think the simplest and most direct approach is:
#standardSQL
SELECT *,
  (SELECT COUNT(1) FROM (
    SELECT * FROM t.column1 INTERSECT DISTINCT
    SELECT * FROM t.column2
  )) AS column3
FROM `project.dataset.table` t
I don't think this last version needs any explanation.
Both of the versions above return the expected output:
Row   column1              column2        column3
1     [cat, dog, bird]     [cat, bird]    2
2     [cat, bear, tiger]   [tiger]        1
3     [cat, tiger]         [tiger, cat]   2