如何将猪袋分组

时间:2016-07-19 11:06:23

标签: hadoop apache-pig

首先我有数据和我组

A = LOAD './test.txt' USING PigStorage(' ') AS (id:int, time:int, value:float);

B = GROUP A BY time;

例如结果我有这样的结构。

1001    {(1,1001,0.2),(3,1001,0.3),(2,1001,0.3),(4,1001,0.6)}   
1002    {(2,1002,0.5),(1,1002,0.3),(3,1002,0.1),(4,1002,0.6)}  
1003    {(4,1003,0.2),(1,1003,0.8),(2,1003,0.4),(3,1003,0.5)}

但我想要

1001     {(1,1001,0.2),(2,1001,0.3),(3,1001,0.3),(4,1001,0.6)}
1002     {(1,1002,0.3),(2,1002,0.5),(3,1002,0.1),(4,1002,0.6)}   
1003     {(1,1003,0.8),(2,1003,0.4),(3,1003,0.5),(4,1003,0.2)}

1 个答案:

答案 0 :(得分:0)

使用NESTED FOREACH

C = FOREACH B {
          sort_by_id = ORDER A BY id;
          GENERATE group, sort_by_id ;
              };