We are trying to run a Hive query using HiveContext (1.6.0), but we get an 'AnalysisException'. The query is as follows:
select
  coalesce( an, dan),
  case when coalesce( ts, dts) is null then null else ( add_seconds( to_timestamp( concat( to_char( sub_seconds( coalesce( ts, dts),81368), 'yyyyMMdd'), '000000'), 'yyyyMMddHHmmss'), 81368) ) end,
  sum( case when ( mmm in ( 1 ) and mgk is null ) then 1 else 0 end ),
  sum( case when ( mmm in ( 2 ) and mgk is null ) then 1 else 0 end ),
  sum( case when ( mmm = 3 and dco_ids is not null ) then 1 else 0 end ),
  sum( case when ( mmm = 3 and dco_ids is null and mgk is null ) then 1 else 0 end ),
  sum( case when ( mgk is not null ) then 1 else 0 end )
from mrdm
group by
  coalesce( an, dan),
  case when coalesce( ts, dts) is null then null else ( add_seconds( to_timestamp( concat( to_char( sub_seconds( coalesce( ts, dts),81368), 'yyyyMMdd'), '000000'), 'yyyyMMddHHmmss'), 81368) ) end
The error returned for the query is:
Caused by: org.apache.spark.sql.AnalysisException: expression 'ts' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
Answer 0 (score: 1)
Try using a derived table, so that you don't have to repeat the case expression in the group by:
select
  c1, c2,
  sum(.....),
  sum(.....)
from (
  select *,
    coalesce(an, dan) c1,
    case when ... end c2
  from mrdm
) t
group by c1, c2
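To make the shape of that rewrite concrete, here is a minimal runnable sketch. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 instead of Spark's HiveContext so that it runs without a cluster, the sample rows are made up, the aliases n1 and n_mgk are hypothetical, and the date-truncation expression from the question is replaced by a placeholder (coalesce(ts, dts) + 1), because add_seconds, sub_seconds and to_char are not available in SQLite. SQLite is also more lenient about grouping than Spark, so this shows the derived-table pattern rather than reproducing the AnalysisException.

```python
import sqlite3

# In-memory stand-in for the mrdm table (column names taken from the question,
# types and rows are invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mrdm (an text, dan text, ts integer, dts integer, mmm integer, mgk text)"
)
rows = [
    ("a", None, 10, None, 1, None),
    (None, "a", None, 10, 2, None),
    ("a", None, 10, None, 1, "k"),
    ("b", None, None, None, 1, None),
]
conn.executemany("insert into mrdm values (?, ?, ?, ?, ?, ?)", rows)

# The inner query computes the grouping expressions once and names them c1, c2.
# The outer query groups by those names, so the expressions are not repeated.
# "coalesce(ts, dts) + 1" is a placeholder for the question's date logic.
query = """
select
    c1, c2,
    sum(case when mmm = 1 and mgk is null then 1 else 0 end) as n1,
    sum(case when mgk is not null then 1 else 0 end) as n_mgk
from (
    select *,
        coalesce(an, dan) as c1,
        case when coalesce(ts, dts) is null then null
             else coalesce(ts, dts) + 1 end as c2
    from mrdm
) t
group by c1, c2
order by c1
"""
result = conn.execute(query).fetchall()
print(result)
```

In the real query, the placeholder for c2 would be the full case expression from the question, and the outer select would carry all five sum(...) columns.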