
With SparkSQL on Hive, when I use named_struct in my query, it returns results:

SELECT id, collect_set(emp_info) as employee_info
FROM
    (
     SELECT t.id, named_struct("name", t.emp_name, "dept", t.emp_dept) AS emp_info
     FROM mytable t
    ) a
GROUP BY id

But when I replace named_struct with map, I get an error message:

SELECT id, collect_set(emp_info) as employee_info
FROM
    (
     SELECT t.id, map("name", t.emp_name, "dept", t.emp_dept) AS emp_info
     FROM mytable t
    ) a
GROUP BY id

ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.sql.AnalysisException: cannot resolve 'collect_set(a.`emp_info`)' due to data type mismatch: collect_set() cannot have map type data; line 36 pos 27;
'Distinct

I want to return a map of name and dept. How can I use map together with collect_set? FYI: this query with map runs without problems in Hive (Hue).
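One possible workaround, offered as an untested sketch rather than a confirmed fix: the error says collect_set() rejects map-typed input, so the duplicates can be removed before the map is built and the rows then aggregated with collect_list, which to my knowledge does not have that restriction. The sketch reuses mytable and its columns from the query above, and assumes that emp_name and emp_dept share a type (for example, both strings) so that map() accepts them as values.

```sql
-- Sketch only: deduplicate first, then build the map and collect it.
SELECT id, collect_list(emp_info) AS employee_info
FROM
    (
     -- map() is applied after DISTINCT, so no set operation ever sees a map value
     SELECT d.id, map("name", d.emp_name, "dept", d.emp_dept) AS emp_info
     FROM
         (
          SELECT DISTINCT t.id, t.emp_name, t.emp_dept
          FROM mytable t
         ) d
    ) a
GROUP BY id
```

Because DISTINCT runs on the plain columns, each (id, emp_name, emp_dept) combination yields one map per id, which should match what collect_set would have produced if it accepted maps.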

SparkSQL error: collect_set() cannot have map type data

0 answers: