Using a table column in a Hive window frame range

Date: 2017-12-08 11:46:26

Tags: hadoop apache-spark hive apache-spark-sql hiveql

I am trying to do a collect_list over a window, and I want to limit the window size dynamically based on a column value from the same table.

select concat_ws('->',
         collect_list(CASE WHEN b.colA IN ("bla", "blabla")
                           THEN concat_ws("-", colB, colC) END)
           OVER (PARTITION BY colD
                 ORDER BY time-stamp
                 ROWS BETWEEN colE PRECEDING AND CURRENT ROW)) AS myCol
from (select colA, colB, colC, colD, colE from mytable) a

colA|colB|colC|colD|colE|time-stamp
bla|abc|pqr|INDIA|1|2017-12-10
bla|abc|pqr|CHINA|1|2017-12-11
bla|abc|pqr|INDIA|2|2017-12-12
bla|abc|pqr|INDIA|3|2017-12-13
bla|abc|pqr|CHINA|2|2017-12-14
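
For reference, a minimal DDL sketch to reproduce this sample data (table and column names are from the question; the STRING/INT types and the backticks around the hyphenated time-stamp column are assumptions):

CREATE TABLE mytable (
  colA STRING,
  colB STRING,
  colC STRING,
  colD STRING,
  colE INT,
  `time-stamp` STRING  -- hyphenated name must be backtick-quoted in Hive; ISO date strings sort correctly
);

INSERT INTO TABLE mytable VALUES
  ('bla', 'abc', 'pqr', 'INDIA', 1, '2017-12-10'),
  ('bla', 'abc', 'pqr', 'CHINA', 1, '2017-12-11'),
  ('bla', 'abc', 'pqr', 'INDIA', 2, '2017-12-12'),
  ('bla', 'abc', 'pqr', 'INDIA', 3, '2017-12-13'),
  ('bla', 'abc', 'pqr', 'CHINA', 2, '2017-12-14');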

Hive does not accept colE in the window range here, even though it holds valid numeric values. I get the error:

Error while compiling statement: FAILED: ParseException line 177:89 cannot recognize input near 'colE' 'preceding' 'AND' in windowframeboundary

1 answer:

Answer 0 (score: 1):

From the documentation here: Hive will not accept a column reference in a window frame boundary; only a literal number, UNBOUNDED, or CURRENT ROW can appear there. Below are the available options:

(ROWS | RANGE) BETWEEN (UNBOUNDED | [num]) PRECEDING AND ([num] PRECEDING | CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN CURRENT ROW AND (CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN [num] FOLLOWING AND (UNBOUNDED | [num]) FOLLOWING
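
For illustration, a minimal sketch of the query from the question rewritten with a literal frame size, which Hive will parse (the value 2 here is an assumption standing in for colE, and time-stamp is backtick-quoted because of the hyphen):

SELECT concat_ws('->',
         collect_list(CASE WHEN colA IN ('bla', 'blabla')
                           THEN concat_ws('-', colB, colC) END)
           OVER (PARTITION BY colD
                 ORDER BY `time-stamp`
                 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)) AS myCol
FROM mytable;

In other words, the frame boundary must be a constant; a per-row value such as colE cannot be used there.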