I have a table with the following fields:
ID  Field 1          Field 2
1   22,34,05,44,44   01,02,02,03
2   11,01,05         02,02,01,01,22
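How can I transform this in BigQuery (Standard SQL) so that each field shows only its unique values, sorted in ascending order?
The output should look like this: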
ID  Field 1       Field 2
1   05,22,34,44   01,02,03
2   01,05,11      01,02,22
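I tried using SPLIT, but ended up with hundreds of duplicates, and window functions do not allow DISTINCT, so I could not combine the values back together afterwards.
Please help me figure this out.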
Answer 0 (score: 1)
You can split each string into an array, then use DISTINCT to remove duplicates and ORDER BY to sort:
SELECT
  ID,
  ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field1, ',')) AS x ORDER BY x) AS field1,
  ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field2, ',')) AS x ORDER BY x) AS field2
FROM `project-name`.dataset.table
If you want to turn the arrays back into comma-separated strings, you can use the ARRAY_TO_STRING function:
SELECT
  ID,
  ARRAY_TO_STRING(ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field1, ',')) AS x ORDER BY x), ',') AS field1,
  ARRAY_TO_STRING(ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field2, ',')) AS x ORDER BY x), ',') AS field2
FROM `project-name`.dataset.table
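To try this without access to the real table, the same query can be run against inline data. The following is only a sketch: the `sample` CTE is a hypothetical stand-in for the table, populated with the two rows from the question.

```sql
-- Hypothetical inline sample data standing in for the real table.
WITH sample AS (
  SELECT 1 AS ID, '22,34,05,44,44' AS field1, '01,02,02,03' AS field2
  UNION ALL
  SELECT 2, '11,01,05', '02,02,01,01,22'
)
SELECT
  ID,
  -- Split on commas, deduplicate, sort as strings, then join back with commas.
  ARRAY_TO_STRING(ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field1, ',')) AS x ORDER BY x), ',') AS field1,
  ARRAY_TO_STRING(ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field2, ',')) AS x ORDER BY x), ',') AS field2
FROM sample
ORDER BY ID;
```

With this data the result should match the expected output in the question. Note that ORDER BY x compares the values as strings, so it agrees with numeric order here only because every value is a zero-padded two-digit number. If the values can differ in width (for example 5 and 12), they would probably need to be cast to a numeric type before sorting.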