Sorting in a Pig query

Time: 2012-02-03 07:18:59

Tags: group-by sql-order-by apache-pig

grunt> dump jn;

(k1,k4,10)
(k1,k5,15)
(k2,k4,9)
(k3,k4,16)

grunt> jn = group jn by $1;
grunt> dump jn;


(k4,{(k1,k4,10),(k2,k4,9),(k3,k4,16)})
(k5,{(k1,k5,15)})

Now, from here I want the following output:

(k4,{(k3,k4,16),(k1,k4,10)})
(k5,{(k1,k5,15)})

Basically, I want to sort the numbers (10, 9, 16) in descending order and keep the top 2 for each group. How can I do that?

1 answer:

Answer 0: (score: 9)

This is similar to this question; you can use a nested FOREACH, for example:

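-- The schema is an assumption: the field names are illustrative, and $2 must be
-- numeric so that ORDER compares values as numbers rather than as raw bytes.
A = LOAD 'data' AS (f0:chararray, f1:chararray, val:int);
jn = GROUP A BY $1;
B = FOREACH jn {
  -- Sort each group's bag by the third field, largest first.
  sorted = ORDER A BY $2 DESC;
  -- Keep only the top 2 tuples of each group.
  lim = LIMIT sorted 2;
  GENERATE group, lim;
};
DUMP B;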
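
To sanity-check what the group / sort / limit steps should produce, the same logic can be mirrored outside Pig. The following is only a rough Python sketch, not Pig code: the helper name `top_n_per_group` and its parameters are hypothetical, and the rows are copied from the `dump jn` output in the question.

```python
from collections import defaultdict

# Rows copied from the `dump jn` output in the question.
rows = [("k1", "k4", 10), ("k1", "k5", 15), ("k2", "k4", 9), ("k3", "k4", 16)]


def top_n_per_group(rows, key_index=1, value_index=2, n=2):
    """Group rows by one field, sort each group descending by another, keep n.

    Hypothetical helper that mirrors GROUP ... BY, a nested ORDER ... DESC,
    and LIMIT; it is not part of Pig or of any library.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return {
        key: sorted(bag, key=lambda r: r[value_index], reverse=True)[:n]
        for key, bag in groups.items()
    }


result = top_n_per_group(rows)
for key in sorted(result):
    print(key, result[key])
```

For the k4 group this keeps (k3,k4,16) and (k1,k4,10) and drops (k2,k4,9), which matches the output requested in the question.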