Impala: all DISTINCT aggregate functions need to have the same set of parameters

Time: 2016-08-30 19:56:58

Tags: sql hdfs impala

I get an error from the following Impala query:

select 
   upload_key, 
   max(my_timestamp) as upload_time, 
   max(color_key) as max_color_fk, 
   count(distinct color_key) as color_count, 
   count(distinct id) as toy_count 
from upload_table 
group by upload_key

It fails with this error:

AnalysisException: all DISTINCT aggregate functions need to have the same set of parameters as count(DISTINCT color_key); deviating function: count(DISTINCT id)

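I'm not sure why I get this error. For each group (grouped by upload_key), I'm trying to count how many distinct id values and how many distinct color_key values there are.
Does anyone have any ideas?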

1 answer:

Answer 0: (score: 7)

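The error message indicates that DISTINCT is only allowed on a single column [combination] per query, but you are using it on two: color_key and id. A workaround is to compute each distinct count in its own select and then join the two results on upload_key: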

select
   t1.upload_key,
   t1.upload_time,
   t1.max_color_fk,
   t1.color_count,
   t2.toy_count
from
 (
   select 
      upload_key, 
      max(my_timestamp) as upload_time, 
      max(color_key) as max_color_fk, 
      count(distinct color_key) as color_count
   from upload_table 
   group by upload_key
 ) as t1
join
 (
   select 
      upload_key,
      count(distinct id) as toy_count 
   from upload_table 
   group by upload_key
 ) as t2
on t1.upload_key = t2.upload_key
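
If exact counts are not required, a single-pass alternative may be worth considering. The sketch below assumes that approximate results are acceptable and uses Impala's NDV() aggregate function, which returns an estimate of the number of distinct values and, unlike COUNT(DISTINCT ...), can be applied to several different columns in one query. It reuses the table and column names from the question and is not a drop-in replacement for the join above when exact numbers matter:

```sql
-- Assumption: approximate distinct counts are good enough here.
-- NDV() estimates the number of distinct values, so color_count
-- and toy_count may differ slightly from the exact counts.
select
   upload_key,
   max(my_timestamp) as upload_time,
   max(color_key) as max_color_fk,
   ndv(color_key) as color_count,
   ndv(id) as toy_count
from upload_table
group by upload_key
```

This avoids scanning upload_table twice, at the cost of precision. The join-based query remains the option to use when the counts must be exact.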