In Hive

Time: 2019-02-19 10:15:44

Tags: sql regex hive hiveql

I have used the following code to concatenate results:

SELECT 
COLLECT_LIST(col_name) AS my_col
FROM my_table

This gets me part of the way to what I want. The output looks like this:

["car","motorcycle","bus"]
["train","boat"]
["airplane","bicycle"]

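However, I need to remove the square brackets and quotes before the values are displayed in a downstream business report.

I have tried various iterations of the following, all to no avail: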

regexp_extract(my_col,'\\[|\\]','')

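This produces the error message:

java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10014]: Line 6:0 Wrong arguments '''': No matching method for class org.apache.hadoop.hive.ql.udf.UDFRegExpExtract with (array<string>, string, string). Possible choices: _FUNC_(string, string) _FUNC_(string, string, int)

How can I achieve my expected...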

car, motorcycle
train, boat
airplane, bicycle 

Is the regex_replace function the best way to do this?

Any guidance is much appreciated.

2 Answers:

Answer 0: (score: 3)

Use concat_ws from the start to join the array of strings:

SELECT concat_ws(', ', collect_set(col_name)) AS my_col FROM my_table
       ^ ---------------------------------- ^    

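concat_ws is the better fit because you have an array of strings, not a single string. The regexp functions operate on a single string, which is why the array argument was rejected.

Answer 1: (score: 0)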
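
As a minimal sketch of how this might look with one output row per group: the query in the question shows no GROUP BY, so `group_col` below is a hypothetical grouping column and not something from the original post. Note that collect_set drops duplicate values, while collect_list (used in the question) keeps them, so pick whichever matches the report.

```sql
-- Sketch only: group_col is an assumed column name used for illustration.
SELECT
  group_col,
  -- collect_list keeps duplicates; swap in collect_set to drop them
  concat_ws(', ', collect_list(col_name)) AS my_col
FROM my_table
GROUP BY group_col;
```

Because concat_ws returns a plain string, the result carries no brackets or quotes and needs no regex cleanup afterwards.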

hive> with t1 as (select cast(1 as string) as col1 union  select cast(2 as string) as col1 union  select cast(3 as string) as col1) select collect_set(col1), concat_ws(',', collect_set(col1)) from t1;
OK
["1","2","3"]   1,2,3
Time taken: 95.27 seconds, Fetched: 1 row(s)

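concat_ws accepts only strings or arrays of strings, which is why the values above are cast to string before being collected.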
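
For instance, a numeric column would need a cast before it is aggregated. A short sketch, assuming a hypothetical integer column `num_col` in `my_table` (neither the column nor its type comes from the original question):

```sql
-- Sketch only: num_col is an assumed non-string column.
-- Cast each value to STRING so that collect_list yields array<string>,
-- which is the array type concat_ws accepts.
SELECT concat_ws(',', collect_list(CAST(num_col AS STRING))) AS my_col
FROM my_table;
```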

If the number of items is fixed, you can also use

hive> with t1 as (select cast(1 as string) as col1 union  select cast(2 as string) as col1 union  select cast(3 as string) as col1) select concat(collect_set(col1)[0], ',' ,collect_set(col1)[1], ',', collect_set(col1)[2]) from t1;
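
A variant sketch of the same idea that builds the array once in a second CTE and then indexes into it, rather than repeating collect_set three times. The CTE name `agg` and the alias `items` are illustrative. Two caveats, as far as I know: collect_set does not guarantee element order, and indexing past the end of the array yields NULL, which in turn makes concat return NULL, so this only suits cases where the item count really is fixed.

```sql
-- Sketch only: agg and items are illustrative names.
WITH t1 AS (
  SELECT CAST(1 AS STRING) AS col1
  UNION ALL SELECT CAST(2 AS STRING) AS col1
  UNION ALL SELECT CAST(3 AS STRING) AS col1
),
agg AS (
  -- Build the array a single time
  SELECT collect_set(col1) AS items FROM t1
)
-- Join exactly three elements by position
SELECT concat(items[0], ',', items[1], ',', items[2]) AS my_col
FROM agg;
```

concat_ws remains the simpler choice when the number of items can vary.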