Failed to breakup Windowing invocations into Groups. Error: org.apache.hadoop.hive.ql.parse

Asked: 2016-08-01 19:54:53

Tags: hadoop hive hiveql

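I am trying to combine two columns from two tables to generate a unique id for a column: the maximum column value from one table plus the row number from the other table. Run separately, the two queries are: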

select (MAX(S.m_id))from MPPO S;
select row_number() OVER(ORDER BY G.a,G.r,G.f1,STG.filler2,G.n_p,G.fe,G.se) 
FROM mmp G
LEFT OUTER JOIN mppo S
ON TRIM(G.pc) = S.pc;

But when I combine the two queries as follows:

select (MAX(S.m_id))+ row_number() OVER(ORDER BY G.a,G.r,G.f1,STG.filler2,G.n_p,G.fe,G.se) 
FROM mmp G LEFT OUTER JOIN mppo S
ON TRIM(G.pc) = S.pc;

I get the following error:

SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group
must only depend on input columns. Also check for circular dependencies. Underlying error:
org.apache.hadoop.hive.ql.parse.SemanticException

What am I doing wrong? Please help.

2 Answers:

Answer 0: (score: 2)

Select the id from each table separately, then join the outputs:

select concat(t.id,'',t1.id) from (select MAX(S.m_id) as id from MPPO s) t join (
select row_number() OVER(ORDER BY G.a,G.r,G.f1,STG.filler2,G.n_p,G.fe,G.se)  as id
FROM mmp G
LEFT OUTER JOIN mppo S
ON TRIM(G.pc) = S.pc) t1 on 1=1
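
The likely cause of the error is that the combined query mixes a plain aggregate (MAX without GROUP BY) with a window function whose ORDER BY refers to non-aggregated columns, which are no longer available once the rows have been aggregated. If a numeric id (max plus row number) is wanted instead of a concatenated string, the same split into two subqueries can be sketched as below. This is an untested sketch under assumptions: m_id is numeric, filler2 is a column of mmp (the STG alias in the question is not defined in the query), and the names rn, max_id, mx and new_id are made up for illustration.

```sql
-- Sketch only: column ownership (G.filler2) and the aliases are assumptions.
SELECT mx.max_id + t1.rn AS new_id
FROM (
  -- Row numbers are computed on their own, with no aggregate in the same SELECT.
  SELECT row_number() OVER (ORDER BY G.a, G.r, G.f1, G.filler2, G.n_p, G.fe, G.se) AS rn
  FROM mmp G
  LEFT OUTER JOIN mppo S
    ON TRIM(G.pc) = S.pc
) t1
-- One-row subquery holding the current maximum id; the cross join attaches it to every row.
CROSS JOIN (SELECT MAX(m_id) AS max_id FROM mppo) mx;
```

Note that if mppo is empty, max_id is NULL and so is new_id; wrapping it as COALESCE(mx.max_id, 0) would be one way to handle that case.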

Answer 1: (score: 1)

I ran into a similar problem in Hive. I want to share my experience in case someone else hits the same issue.

My select statement had the following window clause:

COUNT () OVER (PARTITION BY mc.source_well_key, mc.report_dt order by mc.report_dt ) AS r_number

The problem was the empty argument list of COUNT (). The correct syntax is to replace count() with count(1) or count(*).
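
For illustration, the corrected expression might look like this inside a full query. The table name well_reports is hypothetical (the answer does not show the FROM clause); only the window expression comes from the answer.

```sql
-- Hypothetical table name; mc.source_well_key and mc.report_dt are from the answer.
SELECT mc.source_well_key,
       mc.report_dt,
       -- COUNT(*) gives the function an argument, which the empty COUNT () lacked.
       COUNT(*) OVER (PARTITION BY mc.source_well_key, mc.report_dt
                      ORDER BY mc.report_dt) AS r_number
FROM well_reports mc;
```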