I have two tables, as shown below:
Table 1:
[1,2,3,4,5]
Table 2:
[2,3,4]
[1,4]
[9,5,7]
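My goal is to find the array in Table 2 that contains the largest number of elements from Table 1. In this example, the expected result is the Table 2 record [2,3,4].
So far I have the following, but I am struggling to work in the "maximum number of matching elements" logic: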
#standardSQL
WITH query_items AS (
  SELECT [96072688,25185958] AS items
),
lookup_values AS (
  SELECT antecedent FROM recommendation_engine.association_rules
)
SELECT query_items.items, lookup_values.antecedent
FROM query_items, lookup_values, UNNEST([(
  SELECT ARRAY_LENGTH(query_items.items) - COUNT(1)
  FROM UNNEST(query_items.items) AS input
  JOIN UNNEST(lookup_values.antecedent) AS output
  ON input = output)]) AS results
WHERE results = 0
Thanks in advance for any help!
Answer 0 (score: 2)
The example below (for BigQuery Standard SQL) should give you an idea:
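#standardSQL
WITH `project.dataset.table1` AS (
  -- Table 1: the reference array
  SELECT [1,2,3,4,5] AS target
), `project.dataset.table2` AS (
  -- Table 2: one candidate array per row
  SELECT [2,3,4] AS candidates UNION ALL
  SELECT [1,4] UNION ALL
  SELECT [9,5,7]
)
SELECT *,
  -- Count the elements shared by target and candidates
  (SELECT COUNT(1)
   FROM t1.target x
   JOIN t2.candidates y
   ON x = y
  ) AS matches
FROM `project.dataset.table1` t1
CROSS JOIN `project.dataset.table2` t2
-- Keep only the candidate with the most matches
ORDER BY matches DESC
LIMIT 1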
with the result:
Row  target       candidates  matches
1    [1,2,3,4,5]  [2,3,4]     3
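To connect this back to the query in the question, here is a minimal sketch of the same idea applied to the asker's own names. It assumes that `antecedent` in `recommendation_engine.association_rules` is an ARRAY<INT64> column, which the question implies but does not state. The aliases `q`, `r`, and `matches` are illustrative. It also swaps `COUNT(1)` for `COUNT(DISTINCT x)`, a small change from the answer above so that repeated values in either array are not counted more than once.

```sql
#standardSQL
-- Sketch only: assumes association_rules.antecedent is ARRAY<INT64>.
WITH query_items AS (
  SELECT [96072688, 25185958] AS items
)
SELECT
  q.items,
  r.antecedent,
  -- Number of distinct query items that also appear in this antecedent
  (SELECT COUNT(DISTINCT x)
   FROM UNNEST(q.items) AS x
   JOIN UNNEST(r.antecedent) AS y
   ON x = y
  ) AS matches
FROM query_items AS q
CROSS JOIN recommendation_engine.association_rules AS r
-- Highest overlap first; keep the single best row
ORDER BY matches DESC
LIMIT 1
```

Note that when several rows share the same highest `matches` value, `LIMIT 1` returns only one of them, and which one is not defined unless a tie-breaking column is added to the `ORDER BY`.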