I have a table:
+----------+----------------------------+----------+-------------+
| visit_id | browsed_categories         | num_seen | num_borrows |
+----------+----------------------------+----------+-------------+
| 1        | fiction,history            | 20       | 3           |
| 2        | selfhelp,fiction,science   | 15       | 3           |
| 3        | cooking,kids,home,selfhelp | 7        | 2           |
+----------+----------------------------+----------+-------------+
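I am trying to summarize this table to find out whether the distinct browsed categories are related to the borrow rate. The desired output looks like this: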
+-------------+---------------------------------+-------------------------+
| borrow_rate | num_distinct_browsed_categories | distinct_categories     |
+-------------+---------------------------------+-------------------------+
| 0           | 3                               | cooking,selfhelp,home   |
| 1           | 2                               | history,fiction         |
+-------------+---------------------------------+-------------------------+
My query is as follows:
    select
      *,
      count(distinct(split(all_cats, ','))) as num_distinct_browsed_categories
    from
    (
      select
        (num_borrows/num_seen) as borrow_rate,
        count(visit_id) as num_visits,
        group_concat(browsed_categories, ',') as all_cats
      from [table]
      group by borrow_rate
    )
The query gives me this error:
Cannot use count distinct with scoped aggregation
How can I modify the query to get the desired output?
Answer 0 (score: 2)
Below is a version for BigQuery Standard SQL:
    #standardSQL
    SELECT
      *,
      (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
    FROM (
      SELECT
        (num_borrows/num_seen) AS borrow_rate,
        COUNT(visit_id) AS num_visits,
        STRING_AGG(browsed_categories, ',') AS all_cats
      FROM `project.dataset.table`
      GROUP BY borrow_rate
    )
By the way, if for some reason you are still bound to BigQuery Legacy SQL, just replace
    count(distinct(split(all_cats, ',')))
with
    exact_count_distinct(split(all_cats, ','))
in the original query.
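
If you want to try the Standard SQL approach without pointing at a real table, here is a minimal sketch that inlines the three sample rows from the question. The CTE name `visits` is hypothetical, and the `distinct_categories` column is my own addition (it is not in the query above): it assumes `STRING_AGG(DISTINCT ...)` over the same unnested list is an acceptable way to produce the third column of the desired output.

```sql
#standardSQL
-- Inline copy of the sample rows from the question; `visits` is a made-up name.
WITH visits AS (
  SELECT 1 AS visit_id, 'fiction,history' AS browsed_categories,
         20 AS num_seen, 3 AS num_borrows UNION ALL
  SELECT 2, 'selfhelp,fiction,science', 15, 3 UNION ALL
  SELECT 3, 'cooking,kids,home,selfhelp', 7, 2
)
SELECT
  borrow_rate,
  num_visits,
  -- Split the concatenated list, unnest it, and count distinct elements.
  (SELECT COUNT(DISTINCT cat)
   FROM UNNEST(SPLIT(all_cats, ',')) AS cat) AS num_distinct_browsed_categories,
  -- Assumption: rebuild a de-duplicated, comma-separated list the same way.
  (SELECT STRING_AGG(DISTINCT cat, ',')
   FROM UNNEST(SPLIT(all_cats, ',')) AS cat) AS distinct_categories
FROM (
  SELECT
    num_borrows / num_seen AS borrow_rate,   -- INT64 / INT64 yields FLOAT64
    COUNT(visit_id) AS num_visits,
    STRING_AGG(browsed_categories, ',') AS all_cats
  FROM visits
  GROUP BY borrow_rate
)
ORDER BY borrow_rate
```

Note that with these three sample rows every visit has a different borrow rate (3/20, 3/15 and 2/7), so each group contains a single visit. On real data you would probably want to bucket the rate first so that visits actually fall into shared groups.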