Question

We are trying to run a Hive query using HiveContext (1.6.0), but we get an 'AnalysisException'. The query is as follows:

select
    coalesce(an, dan),
    case when coalesce(ts, dts) is null then null else (add_seconds(to_timestamp(concat(to_char(sub_seconds(coalesce(ts, dts), 81368), 'yyyyMMdd'), '000000'), 'yyyyMMddHHmmss'), 81368)) end,
    sum(case when (mmm in (1) and mgk is null) then 1 else 0 end),
    sum(case when (mmm in (2) and mgk is null) then 1 else 0 end),
    sum(case when (mmm = 3 and dco_ids is not null) then 1 else 0 end),
    sum(case when (mmm = 3 and dco_ids is null and mgk is null) then 1 else 0 end),
    sum(case when (mgk is not null) then 1 else 0 end)
from mrdm
group by
    coalesce(an, dan),
    case when coalesce(ts, dts) is null then null else (add_seconds(to_timestamp(concat(to_char(sub_seconds(coalesce(ts, dts), 81368), 'yyyyMMdd'), '000000'), 'yyyyMMddHHmmss'), 81368)) end

The error returned for the query is:

Caused by: org.apache.spark.sql.AnalysisException: expression 'ts' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)

Answer 1

Try using a derived table, so that you don't have to repeat the CASE expression in both the select list and the group by:

select 
    c1, c2,
    sum(.....),
    sum(.....)
from (
    select *,
        coalesce(an, dan) c1,
        case when ... end c2           
    from mrdm 
) t group by c1, c2
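
For reference, here is a sketch of what that pattern might look like with the elided parts filled in from the query in the question. The expressions are copied from the question; add_seconds, sub_seconds, to_char and to_timestamp are assumed to be available as registered UDFs, and the query has not been run against Spark 1.6.0. The aliases c1, c2 and t follow the outline above. The inner select lists only the columns the aggregates need instead of `select *`, which is an optional change.

```sql
-- Sketch only: derived-table version of the question's query.
-- Assumes add_seconds, sub_seconds, to_char and to_timestamp are registered UDFs.
select
    c1,
    c2,
    sum(case when (mmm in (1) and mgk is null) then 1 else 0 end),
    sum(case when (mmm in (2) and mgk is null) then 1 else 0 end),
    sum(case when (mmm = 3 and dco_ids is not null) then 1 else 0 end),
    sum(case when (mmm = 3 and dco_ids is null and mgk is null) then 1 else 0 end),
    sum(case when (mgk is not null) then 1 else 0 end)
from (
    -- Compute each grouping expression once and give it an alias.
    select
        mmm,
        mgk,
        dco_ids,
        coalesce(an, dan) as c1,
        case
            when coalesce(ts, dts) is null then null
            else add_seconds(
                     to_timestamp(
                         concat(to_char(sub_seconds(coalesce(ts, dts), 81368), 'yyyyMMdd'), '000000'),
                         'yyyyMMddHHmmss'),
                     81368)
        end as c2
    from mrdm
) t
-- Group on the aliases, so the analyzer sees plain column references.
group by c1, c2
```

Because the outer query groups on plain columns of t, the long CASE expression appears only once and no longer has to match textually between the select list and the group by.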
