SparkSQL error: collect_set() cannot have map type data

Asked: 2019-06-18 13:17:04

Tags: apache-spark hive apache-spark-sql

Using SparkSQL on Hive, when I use named_struct in my query, it returns results as expected:

SELECT id, collect_set(emp_info) as employee_info
FROM
    (
     SELECT t.id, named_struct("name", t.emp_name, "dept", t.emp_dept) AS emp_info
     FROM mytable t
    ) a
GROUP BY id

But when I replace named_struct with map, I get an error message:

SELECT id, collect_set(emp_info) as employee_info
FROM
    (
     SELECT t.id, map("name", t.emp_name, "dept", t.emp_dept) AS emp_info
     FROM mytable t
    ) a
GROUP BY id

ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.sql.AnalysisException: cannot resolve 'collect_set(a.`emp_info`)' due to data type mismatch: collect_set() cannot have map type data; line 36 pos 27;
'Distinct

I want each element to be a map of name and dept. How can I use map with collect_set? FYI: the same query with map runs without problems in Hive (Hue).
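
Editorial sketch (not part of the original post): the error indicates that Spark's collect_set rejects map-typed input, whereas collect_list is not expected to have that restriction. One possible workaround, assuming emp_name and emp_dept share a type (for example, both strings) and that removing duplicate (id, emp_name, emp_dept) rows beforehand is an acceptable substitute for set semantics, is to deduplicate first and then collect the maps as a list. This has not been run against the asker's table.

```sql
-- Sketch: emulate collect_set over maps by deduplicating the source rows
-- first, then aggregating with collect_list (which accepts map values).
SELECT id, collect_list(emp_info) AS employee_info
FROM
    (
     -- Build the map only after duplicates have been removed.
     SELECT d.id, map("name", d.emp_name, "dept", d.emp_dept) AS emp_info
     FROM
         (
          -- DISTINCT on the plain columns stands in for the set semantics.
          SELECT DISTINCT id, emp_name, emp_dept
          FROM mytable
         ) d
    ) a
GROUP BY id
```

Unlike collect_set, this removes duplicates by comparing the underlying columns rather than the maps themselves, which gives the same result here because each map is built only from those two columns.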

0 answers:

No answers yet