I want to extract the unique values from a comma-separated string value in a Hive table.
As is:
select * from data;
ID ITEMS
123 "ABB","REG","REG", "ABB","XYZ"
Expected result:
select ===some logic=== from data;
ID ITEMS
123 "ABB","REG","XYZ"
Please suggest an approach.
Answer 0 (score: 2)
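Use explode together with split to break the CSV string into one row per value, then apply collect_set to the split values to remove the duplicates. The result is an array, so use concat_ws to turn it back into a CSV string.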
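-- split the CSV string, explode it into rows, then rebuild a de-duplicated CSV
-- trim() drops the stray space before "ABB" so it is not kept as a separate value
select id, items, concat_ws(',', collect_set(trim(split_item))) as result
from data
lateral view explode(split(items, ',')) tbl as split_item
group by id, items;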
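
To try the approach without touching a real table, here is a minimal self-contained sketch. It assumes a Hive version that supports CTEs and SELECT without FROM (0.13 or later); the inline `data` CTE is only a stand-in for the table in the question. The `sort_array` call is an addition that is not part of the answer above: collect_set does not guarantee element order, so sorting makes the result deterministic.

```sql
-- Stand-in for the question's table: one row with the sample CSV string.
with data as (
  select 123 as id, '"ABB","REG","REG", "ABB","XYZ"' as items
)
select id,
       -- trim() removes whitespace around each piece, collect_set() de-duplicates,
       -- sort_array() fixes the order, concat_ws() joins the array back into a CSV.
       concat_ws(',', sort_array(collect_set(trim(split_item)))) as result
from data
lateral view explode(split(items, ',')) tbl as split_item
group by id;
```

With the sample row this should give `"ABB","REG","XYZ"` for ID 123. If the original order of first appearance matters, sorting is not the right choice and the order would have to be tracked separately, for example with posexplode.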