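I am trying to union a bunch of tables in which some fields are arrays of bigint. When I try to cast these fields in Hive, it keeps giving me the error "Expression not in GROUP BY key 'field_name'".
My code is as follows: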
CREATE TABLE sample_table stored AS avro AS
SELECT cast(cast(rand()* 1000000000000000000 AS BIGINT) AS string) id,
'PERSON'AS person,
name AS name,
age AS age,
gender AS gender
FROM table_a
UNION ALL
SELECT cast(cast(rand()* 1000000000000000000 AS BIGINT) AS string) id,
'PERSON'AS person,
name AS name,
collect_list(
CAST(NULL AS BIGINT)
) AS age,
null AS gender
FROM table_b
The error being generated is as follows:
SQL Error [500051] [HY000]: [Cloudera][HiveJDBCDriver] ERROR processing query/statement. Error Code: 10025, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while compiling statement: FAILED: SemanticException [Error 10025]: Line 4:7 Expression not in GROUP BY key 'age':28:27, org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:400, org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:187
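Answer 0 (score: 1)
collect_list is an aggregate function. Once a SELECT contains an aggregate, every other non-constant expression in the select list has to appear in a GROUP BY clause: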
SELECT cast(cast(rand()* 1000000000000000000 AS BIGINT) AS string) id,
'PERSON'AS person,
name,
collect_list(CAST(NULL AS BIGINT)) AS age,
null AS gender
FROM table_b
group by cast(cast(rand()* 1000000000000000000 AS BIGINT) AS string), name
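If the only goal is to give the second branch of the UNION ALL an array-typed placeholder column, it may be possible to avoid aggregation altogether. The following is a rough sketch, not a tested fix: it assumes that age in table_a is array<bigint> and that gender is a string (neither is stated in the question), and it uses Hive's built-in array() constructor in place of collect_list.

```sql
-- Sketch only: assumes table_a.age is array<bigint> and gender is string.
SELECT cast(cast(rand() * 1000000000000000000 AS BIGINT) AS string) AS id,
       'PERSON' AS person,
       name,
       -- array() is not an aggregate, so no GROUP BY is required
       array(CAST(NULL AS BIGINT)) AS age,
       CAST(NULL AS STRING) AS gender
FROM table_b
```

Note that array(CAST(NULL AS BIGINT)) should produce a one-element array holding NULL, which may not be the same value the collect_list version returns, so check which placeholder the downstream consumers expect.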