I used the following code to concatenate the results:
SELECT
COLLECT_LIST(col_name) AS my_col
FROM my_table
This gets me part of the way to what I want. The output looks like this:
["car","motorcycle","bus"]
["train","boat"]
["airplane","bicycle"]
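However, I need to remove the square brackets and quotation marks before the values appear in a downstream business report.
I have tried various iterations of the following, to no avail: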
regexp_extract(my_col,'\\[|\\]','')
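This produces the error message:
java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10014]: Line 6:0 Wrong arguments '''': No matching method for class org.apache.hadoop.hive.ql.udf.UDFRegExpExtract with (array<string>, string, string). Possible choices: _FUNC_(string, string) _FUNC_(string, string, int)
How can I achieve my desired output of...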
car, motorcycle
train, boat
airplane, bicycle
Is the regexp_replace function the best way to do this?
Any guidance is much appreciated.
Answer 0 (score: 3)
Use concat_ws from the start to concatenate the array of strings:
SELECT concat_ws(', ', collect_set(col_name)) AS my_col FROM my_table
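concat_ws is the better fit here because you have an array of strings, not a single string. The regexp functions only accept a single string, which is why regexp_extract rejected the array argument.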
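The question's output has one list per row, which suggests the original query grouped by some column that was not shown. A minimal sketch of that case follows, assuming a hypothetical grouping column named `group_col` (not in the original post). It uses collect_list, as the question does. collect_set would additionally drop duplicate values.

```sql
-- Sketch only: `group_col` is a hypothetical grouping column, not from the question.
-- collect_list keeps duplicates; swap in collect_set to de-duplicate.
SELECT
  group_col,
  concat_ws(', ', collect_list(col_name)) AS my_col
FROM my_table
GROUP BY group_col;
```

Neither collect_list nor collect_set guarantees element order, so wrap the array in sort_array(...) if the report needs a stable ordering.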
Answer 1 (score: 0)
hive> with t1 as (select cast(1 as string) as col1 union select cast(2 as string) as col1 union select cast(3 as string) as col1) select collect_set(col1), concat_ws(',', collect_set(col1)) from t1;
OK
["1","2","3"] 1,2,3
Time taken: 95.27 seconds, Fetched: 1 row(s)
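concat_ws accepts only strings or arrays of strings, which is why the values above are cast to string first.
If the number of items is fixed, you can also use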
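As an illustration of that restriction, here is a sketch that casts a non-string column before aggregating. The table `orders` and the columns `customer` and `order_id` are hypothetical names, assumed only for this example.

```sql
-- Sketch only: `orders`, `customer`, and `order_id` are hypothetical.
-- order_id is assumed to be an INT, so it is cast to STRING before collect_list,
-- giving concat_ws the array<string> it expects.
SELECT
  customer,
  concat_ws(',', collect_list(CAST(order_id AS STRING))) AS order_ids
FROM orders
GROUP BY customer;
```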
hive> with t1 as (select cast(1 as string) as col1 union select cast(2 as string) as col1 union select cast(3 as string) as col1) select concat(collect_set(col1)[0], ',' ,collect_set(col1)[1], ',', collect_set(col1)[2]) from t1;
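The fixed-size variant repeats collect_set three times. One possible way to tidy it, sketched here with the same sample data, is to build the array once in a second CTE and index into it. The CTE name `agg` and the alias `items` are assumptions made for illustration.

```sql
-- Sketch only: `agg` and `items` are illustrative names.
WITH t1 AS (
  SELECT CAST(1 AS STRING) AS col1
  UNION
  SELECT CAST(2 AS STRING) AS col1
  UNION
  SELECT CAST(3 AS STRING) AS col1
),
agg AS (
  -- Build the array once instead of once per element.
  SELECT collect_set(col1) AS items FROM t1
)
SELECT concat(items[0], ',', items[1], ',', items[2]) AS joined
FROM agg;
```

This approach only suits data where every group is known to have exactly that many items, so concat_ws over the whole array is usually the safer choice.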