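Windowing functions (rank over(), etc.) in Pig Latin

Time: 2014-03-13 01:36:23

Tags: apache-pig

Is it possible to implement windowing functionality in Pig Latin? I am currently using a version of Hive that does not support the rank over() clause.

1 answer:

Answer 0 (score: 2)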

This may help: RANK was implemented in Pig 0.11 (see the RANK documentation for version 0.12), and Over() was implemented as a piggybank function.
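
As a rough illustration of the built-in operator, here is a minimal sketch of RANK applied to a whole relation. The file name 'T' and the fields s and f are assumptions made for this example. RANK ranks the entire relation rather than each group, so for a per-partition window the Over-based examples are the closer match.

```pig
-- Assumed input: file 'T' with two fields; names and types are hypothetical.
A = LOAD 'T' AS (s:chararray, f:float);

-- RANK prepends a rank column to every tuple.
-- With BY, rows that tie on f share a rank and the next rank is skipped.
B = RANK A BY f DESC;

-- DENSE keeps the ranks consecutive after ties.
C = RANK A BY f DESC DENSE;

DUMP B;
DUMP C;
```

Without a BY clause, RANK simply numbers the rows in sequence.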

Example usage:
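
The snippets below call Over and Stitch by their short names, which only works once the functions are registered and aliased. A minimal setup sketch, assuming a piggybank build that includes both classes; the jar path is a placeholder, not a real location:

```pig
-- Placeholder path: point this at your own piggybank.jar.
REGISTER /path/to/piggybank.jar;

-- Short aliases so the scripts can write Over(...) and Stitch(...).
DEFINE Over org.apache.pig.piggybank.evaluation.Over();
DEFINE Stitch org.apache.pig.piggybank.evaluation.Stitch();
```

Without the DEFINE statements, each call would need the fully qualified class name instead.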

To do a cumulative sum:

 A = load 'T';
 B = group A by si;
 C = foreach B {
     C1 = order A by d;
     generate flatten(Stitch(C1, Over(C1.f, 'sum(float)')));
 }
 D = foreach C generate s, $9;


This is equivalent to the SQL statement

select s, sum(f) over (partition by si order by d) from T;




To find the record 3 ahead of the current record, using a window between the current row and 3 records ahead and a default value of 0.

 A = load 'T';
 B = group A by si;
 C = foreach B {
     C1 = order A by i;
     generate flatten(Stitch(C1, Over(C1.i, 'lead', 0, 3, 3, 0)));
 }
 D = foreach C generate s, $9;


This is equivalent to the SQL statement

select s, lead(i, 3, 0) over (partition by si order by i rows between current row and 3 following) from T;