Question

I have used the following code to concatenate results:

SELECT 
COLLECT_LIST(col_name) AS my_col
FROM my_table

This gets me part of the way to the result I want, with output like this:

["car","motorcycle","bus"]
["train","boat"]
["airplane","bicycle"]

However, I need to remove the square brackets and quotation marks before the values are shown in a downstream business report.

I have tried various iterations of the following, to no avail:

regexp_extract(my_col,'\\[|\\]','')

This produces the error message:

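java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10014]: Line 6:0 Wrong arguments '''': No matching method for class org.apache.hadoop.hive.ql.udf.UDFRegExpExtract with (array<string>, string, string). Possible choices: _FUNC_(string, string)  _FUNC_(string, string, int)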

How can I get to my desired output...

car, motorcycle
train, boat
airplane, bicycle

Is the regex_replace function the best way to do this?

Any guidance is much appreciated.

Answer 1

Use concat_ws from the start to join the array of strings:

SELECT concat_ws(', ', collect_set(col_name)) AS my_col FROM my_table
       ^ ---------------------------------- ^

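concat_ws is the better fit because you have an array of strings rather than a single string. Note that collect_set drops duplicate values; keep collect_list, as in your original query, if duplicates should be preserved.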

Answer 2

hive> with t1 as (select cast(1 as string) as col1 union  select cast(2 as string) as col1 union  select cast(3 as string) as col1) select collect_set(col1), concat_ws(',', collect_set(col1)) from t1;
OK
["1","2","3"]   1,2,3
Time taken: 95.27 seconds, Fetched: 1 row(s)

concat_ws only accepts arrays of string values.
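If the column being aggregated is not already a string, one option is to cast it before collecting. This is only a minimal sketch: the grouping column grp and the integer column id are hypothetical and do not appear in the question.

```sql
-- Hypothetical columns: grp (grouping key) and id (INT); not from the original question.
SELECT
  grp,
  -- Cast each value to STRING so collect_list yields array<string>,
  -- the array type that concat_ws accepts.
  concat_ws(', ', collect_list(CAST(id AS STRING))) AS ids
FROM my_table
GROUP BY grp;
```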

If the number of items is fixed, you can also use:

hive> with t1 as (select cast(1 as string) as col1 union  select cast(2 as string) as col1 union  select cast(3 as string) as col1) select concat(collect_set(col1)[0], ',' ,collect_set(col1)[1], ',', collect_set(col1)[2]) from t1;
