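My goal is to find the most frequent value, grouped by User ID, using BigQuery. The query should count how often each language is used per User ID and return the most-used language for each one. However, I get this error: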
No matching signature for aggregate function AVG for argument types: STRING. Supported signatures: AVG(INT64); AVG(FLOAT64); AVG(NUMERIC) at [3:5]
Here is my code:
SELECT * FROM(
SELECT COUNT(*) AS cnt,
AVG(Language) AS mean,
APPROX_TOP_COUNT(Language, 1)[OFFSET(0)].value AS most_frequent_value
FROM `language`
WHERE Language IS NOT NULL
GROUP BY User_ID);
What should I change so that the result returns the preferred language value for each User ID?
Stored procedure:
CASE
WHEN Preferred_Language in ('EN', 'English') THEN 'EN'
ELSE Preferred_Language
END AS Preferred_Language,
Answer 0 (score: 2)
Below is for BigQuery Standard SQL
#standardSQL
SELECT
User_ID,
ARRAY_AGG(Language ORDER BY cnt DESC LIMIT 1)[OFFSET(0)] most_frequent_language
FROM (
SELECT
User_ID,
Language,
COUNT(*) AS cnt
FROM `project.dataset.language`
WHERE Language IS NOT NULL
GROUP BY User_ID, Language
)
GROUP BY User_ID
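
To try the query above without access to the real table, here is a minimal sketch that runs it against inline sample data. The `sample_rows` CTE and its values are hypothetical and only stand in for `project.dataset.language`. Note that the `AVG(Language)` column from the question is dropped entirely: as the error message says, AVG accepts only numeric types, and a mean is not meaningful for a STRING column.

```sql
#standardSQL
-- Hypothetical sample rows standing in for `project.dataset.language`.
WITH sample_rows AS (
  SELECT 1 AS User_ID, 'EN' AS Language UNION ALL
  SELECT 1, 'EN' UNION ALL
  SELECT 1, 'FR' UNION ALL
  SELECT 2, 'DE' UNION ALL
  SELECT 2, CAST(NULL AS STRING)
)
SELECT
  User_ID,
  -- Keep only the language with the highest count for each user.
  ARRAY_AGG(Language ORDER BY cnt DESC LIMIT 1)[OFFSET(0)] AS most_frequent_language
FROM (
  -- Step 1: count rows per (User_ID, Language) pair, ignoring NULLs.
  SELECT
    User_ID,
    Language,
    COUNT(*) AS cnt
  FROM sample_rows
  WHERE Language IS NOT NULL
  GROUP BY User_ID, Language
)
-- Step 2: collapse to one row per user.
GROUP BY User_ID
ORDER BY User_ID
```

With this sample data, user 1 should come back as EN and user 2 as DE. If two languages tie for a user, the one returned is not guaranteed to be stable; adding a second sort key, for example `ORDER BY cnt DESC, Language`, is one way to make the choice deterministic.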