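I've been working with some data, and it is currently output in rows like this:
Customer | Reason
Customer1 | Answer1, Answer3, Answer2, Answer4, Answer5, Answer1, Answer3, Answer1
Is there anything in BigQuery Standard SQL that would remove the duplicates from this string and end up with the output below?
Customer | Reason
Customer1 | Answer1, Answer3, Answer2, Answer4, Answer5
Thanks in advance.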
Answer 0 (score: 4)
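Assuming I've understood the question correctly, you need something that splits the string on the ', ' separator and then aggregates the substrings into a new string, with the DISTINCT keyword removing the duplicates.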
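One way this could look in BigQuery Standard SQL is sketched below. It is an assumption rather than this answer's exact query: the use of STRING_AGG, the column names customer and answers, and the inline sample row are all made up for illustration to mirror the question.

```sql
#standardSQL
-- Hypothetical sample data standing in for the real table.
WITH `project.dataset.table` AS (
  SELECT 'Customer1' customer, 'Answer1, Answer3, Answer2, Answer4, Answer5, Answer1, Answer3, Answer1' answers
)
SELECT
  customer,
  -- Split on ', ', then re-join only the distinct substrings.
  (SELECT STRING_AGG(DISTINCT answer, ', ')
   FROM UNNEST(SPLIT(answers, ', ')) AS answer) AS answers
FROM `project.dataset.table`
```

Note that without an ORDER BY inside STRING_AGG, the order of the de-duplicated values is not guaranteed.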
Answer 1 (score: 2)
While voting up Elliott's answer, I want to add another option (BigQuery Standard SQL):
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Customer1' customer, 'Answer1, Answer3, Answer2, Answer4, Answer5, Answer1, Answer3, Answer1' answers
)
SELECT * REPLACE(
  ARRAY_TO_STRING(ARRAY(SELECT DISTINCT answer
                  FROM UNNEST(SPLIT(answers, ', ')) AS answer
                  ), ', ') AS answers)
FROM `project.dataset.table`
This produces the desired result:
Row  customer   answers
1    Customer1  Answer1, Answer3, Answer2, Answer4, Answer5
If for some reason you want the values sorted, you only need to add one line (the ORDER BY), as below:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Customer1' customer, 'Answer1, Answer3, Answer2, Answer4, Answer5, Answer1, Answer3, Answer1' answers
)
SELECT * REPLACE(
  ARRAY_TO_STRING(ARRAY(SELECT DISTINCT answer
                  FROM UNNEST(SPLIT(answers, ', ')) AS answer
                  ORDER BY answer
                  ), ', ') AS answers)
FROM `project.dataset.table`
The result is:
Row  customer   answers
1    Customer1  Answer1, Answer2, Answer3, Answer4, Answer5
Note: the ordering is most likely irrelevant to the specific use case in your question, but it can come in handy in other cases.
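If the order of first appearance does matter (as in the expected output in the question), keep in mind that a subquery without ORDER BY does not guarantee any particular order. Below is a minimal sketch that keeps each answer at the position where it first occurs, using WITH OFFSET. The aliases pos and first_pos are made up for illustration, and the sample row is the same hypothetical one as above.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Customer1' customer, 'Answer1, Answer3, Answer2, Answer4, Answer5, Answer1, Answer3, Answer1' answers
)
SELECT * REPLACE(
  (SELECT STRING_AGG(answer, ', ' ORDER BY first_pos)
   FROM (
     -- For each distinct answer, remember the earliest position it appears at.
     SELECT answer, MIN(pos) AS first_pos
     FROM UNNEST(SPLIT(answers, ', ')) AS answer WITH OFFSET AS pos
     GROUP BY answer
   )) AS answers)
FROM `project.dataset.table`
```

GROUP BY does the de-duplication here, and ordering by the smallest offset restores the original sequence.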