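Follow-up to this question: Bigquery multiple unnest in a single select
We are using BigQuery as our warehousing solution and are trying to push its limits by experimenting with consolidation. A simple example is client tracking. A client generates revenue, has several touchpoints on our website, and independently maintains several accounts with us. Business users doing behavioral analysis on a client want to track the number of visits, the revenue generated, and how the client's accounts affect retention, so we are trying to evaluate whether a nested structure would work for us.
Below is an example. I have 3 tables.
Client (C)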
C_Key | C_Name
----- | ------
1 | ABC
2 | DEF
Account (A)
A_Key | C_Key
11 | 1
12 | 1
21 | 2
22 | 2
23 | 2
Revenue (R)
R_Key | C_Key | Revenue
----- | ----- | -------
11 | 1 | $10
12 | 1 | $20
21 | 2 | $10
I used array_agg to combine these three into a single nested table, shaped like this:
{Client,
Accounts:
[{
}],
Revenue:
[{
}]
}
I want to be able to use multiple unnests in a single query, like this:
Select client, Count Distinct(Accounts) and SUM(Revenue) from <single nested
table>, unnest accounts, unnest revenue
The expected output is 2 rows:
1, 2, $30
2, 3, $10
However, using multiple unnests in the same query results in a cross join, so the actual output is:
1, 2, $60
2, 3, $30
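To make the inflated totals concrete, here is a minimal sketch that reproduces them. It assumes the nested table has the shape described above; the inline `single_nested_table` literal and the aliases `t`, `a`, and `r` are illustrative names, not the asker's actual table. Each comma before `UNNEST` is a correlated cross join, so every revenue element is repeated once per account element of the same client.

```sql
#standardSQL
-- Illustrative stand-in for the nested table (assumed schema).
WITH single_nested_table AS (
  SELECT 1 AS c_key, 'abc' AS c_name,
    ARRAY<STRUCT<a_key INT64, c_key INT64>>[(11, 1), (12, 1)] AS accounts,
    ARRAY<STRUCT<r_key INT64, c_key INT64, revenue INT64>>[(11, 1, 10), (12, 1, 20)] AS revenue
  UNION ALL
  SELECT 2, 'def',
    ARRAY<STRUCT<a_key INT64, c_key INT64>>[(21, 2), (22, 2), (23, 2)],
    ARRAY<STRUCT<r_key INT64, c_key INT64, revenue INT64>>[(21, 2, 10)]
)
SELECT
  t.c_key,
  COUNT(DISTINCT a.a_key) AS distinct_accounts,  -- DISTINCT hides the duplication
  SUM(r.revenue) AS revenue                      -- SUM does not: each value is counted once per account
FROM single_nested_table AS t,
  UNNEST(t.accounts) AS a,   -- client 1: 2 rows, client 2: 3 rows
  UNNEST(t.revenue) AS r     -- multiplied by 2 and 1 revenue rows respectively
GROUP BY t.c_key
ORDER BY t.c_key
```

Client 1 expands to 2 x 2 = 4 rows, so its revenue sums to (10 + 20) x 2 = 60; client 2 expands to 3 x 1 = 3 rows, so its revenue sums to 10 x 3 = 30.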
Answer 0 (score: 0)
The following is for BigQuery Standard SQL.
First, let's clarify what the single nested table looks like.
I assume you did something like this:
#standardSQL
WITH clients AS (
  SELECT 1 AS c_key, 'abc' AS c_name UNION ALL
  SELECT 2, 'def'
), accounts AS (
  SELECT 11 AS a_key, 1 AS c_key UNION ALL
  SELECT 12, 1 UNION ALL
  SELECT 21, 2 UNION ALL
  SELECT 22, 2 UNION ALL
  SELECT 23, 2
), revenue AS (
  SELECT 11 AS r_key, 1 AS c_key, 10 AS revenue UNION ALL
  SELECT 12, 1, 20 UNION ALL
  SELECT 21, 2, 10
), single_nested_table AS (
  SELECT x.c_key, x.c_name, accounts, revenue
  FROM (
    SELECT c.c_key, c_name, ARRAY_AGG(a) AS accounts
    FROM clients AS c
    LEFT JOIN accounts AS a ON a.c_key = c.c_key
    GROUP BY c.c_key, c_name
  ) x
  JOIN (
    SELECT c.c_key, c_name, ARRAY_AGG(r) AS revenue
    FROM clients AS c
    LEFT JOIN revenue AS r ON r.c_key = c.c_key
    GROUP BY c.c_key, c_name
  ) y
  ON x.c_key = y.c_key
)
SELECT *
FROM single_nested_table
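This produces a table like:
Row | c_key | c_name | accounts.a_key | accounts.c_key | revenue.r_key | revenue.c_key | revenue.revenue
--- | ----- | ------ | -------------- | -------------- | ------------- | ------------- | ---------------
1 | 1 | abc | 11 | 1 | 11 | 1 | 10
  |   |     | 12 | 1 | 12 | 1 | 20
2 | 2 | def | 21 | 2 | 21 | 2 | 10
  |   |     | 22 | 2 |    |   |
  |   |     | 23 | 2 |    |   |
The exact query used to create the table is not important, but being clear about the structure/schema is!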
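Since only the resulting schema matters, the same shape can probably be built more compactly. The following is a sketch of one alternative, not the asker's actual method: it uses correlated `ARRAY(SELECT AS STRUCT ...)` subqueries instead of two aggregated joins. The CTE data is copied from the example; the `ORDER BY` clauses are an added assumption to make the element order deterministic.

```sql
#standardSQL
WITH clients AS (
  SELECT 1 AS c_key, 'abc' AS c_name UNION ALL
  SELECT 2, 'def'
), accounts AS (
  SELECT 11 AS a_key, 1 AS c_key UNION ALL
  SELECT 12, 1 UNION ALL
  SELECT 21, 2 UNION ALL
  SELECT 22, 2 UNION ALL
  SELECT 23, 2
), revenue AS (
  SELECT 11 AS r_key, 1 AS c_key, 10 AS revenue UNION ALL
  SELECT 12, 1, 20 UNION ALL
  SELECT 21, 2, 10
)
SELECT
  c.c_key,
  c.c_name,
  -- One array of account structs per client.
  ARRAY(
    SELECT AS STRUCT a.a_key, a.c_key
    FROM accounts AS a
    WHERE a.c_key = c.c_key
    ORDER BY a.a_key
  ) AS accounts,
  -- One array of revenue structs per client.
  ARRAY(
    SELECT AS STRUCT r.r_key, r.c_key, r.revenue
    FROM revenue AS r
    WHERE r.c_key = c.c_key
    ORDER BY r.r_key
  ) AS revenue
FROM clients AS c
```

With this form, a client that has no matching accounts or revenue rows gets an empty array for that column.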
Now, back to your question:
#standardSQL
WITH clients AS (
  SELECT 1 AS c_key, 'abc' AS c_name UNION ALL
  SELECT 2, 'def'
), accounts AS (
  SELECT 11 AS a_key, 1 AS c_key UNION ALL
  SELECT 12, 1 UNION ALL
  SELECT 21, 2 UNION ALL
  SELECT 22, 2 UNION ALL
  SELECT 23, 2
), revenue AS (
  SELECT 11 AS r_key, 1 AS c_key, 10 AS revenue UNION ALL
  SELECT 12, 1, 20 UNION ALL
  SELECT 21, 2, 10
), single_nested_table AS (
  SELECT x.c_key, x.c_name, accounts, revenue
  FROM (
    SELECT c.c_key, c_name, ARRAY_AGG(a) AS accounts
    FROM clients AS c
    LEFT JOIN accounts AS a ON a.c_key = c.c_key
    GROUP BY c.c_key, c_name
  ) x
  JOIN (
    SELECT c.c_key, c_name, ARRAY_AGG(r) AS revenue
    FROM clients AS c
    LEFT JOIN revenue AS r ON r.c_key = c.c_key
    GROUP BY c.c_key, c_name
  ) y
  ON x.c_key = y.c_key
)
SELECT
  c_key, c_name,
  ARRAY_LENGTH(accounts) AS distinct_accounts,
  (SELECT SUM(revenue) FROM UNNEST(revenue)) AS revenue
FROM single_nested_table
This gives what you asked for:
Row | c_key | c_name | distinct_accounts | revenue
--- | ----- | ------ | ----------------- | -------
1 | 1 | abc | 2 | 30
2 | 2 | def | 3 | 10
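Note that ARRAY_LENGTH counts array elements, not distinct values, so it matches the requested COUNT DISTINCT only while each account appears once per client. If duplicates are possible, a variant along these lines may be closer to the original intent. This is a sketch under the same assumed schema; the inline table literal and the aliases `t`, `a`, and `r` are illustrative. Each array is aggregated inside its own correlated subquery, so the two arrays are never cross joined.

```sql
#standardSQL
-- Illustrative stand-in for the nested table (assumed schema).
WITH single_nested_table AS (
  SELECT 1 AS c_key, 'abc' AS c_name,
    ARRAY<STRUCT<a_key INT64, c_key INT64>>[(11, 1), (12, 1)] AS accounts,
    ARRAY<STRUCT<r_key INT64, c_key INT64, revenue INT64>>[(11, 1, 10), (12, 1, 20)] AS revenue
  UNION ALL
  SELECT 2, 'def',
    ARRAY<STRUCT<a_key INT64, c_key INT64>>[(21, 2), (22, 2), (23, 2)],
    ARRAY<STRUCT<r_key INT64, c_key INT64, revenue INT64>>[(21, 2, 10)]
)
SELECT
  t.c_key,
  t.c_name,
  -- Aggregate each array on its own so neither multiplies the other.
  (SELECT COUNT(DISTINCT a.a_key) FROM UNNEST(t.accounts) AS a) AS distinct_accounts,
  (SELECT SUM(r.revenue) FROM UNNEST(t.revenue) AS r) AS revenue
FROM single_nested_table AS t
ORDER BY t.c_key
```

On the sample data this returns the same two rows as above: (1, abc, 2, 30) and (2, def, 3, 10).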