猪嵌套示例

时间:2015-07-01 17:42:44

标签: apache-pig

我写了以下猪脚本。我怎样才能把它作为嵌套的呢?

input= LOAD '/path/to/input/data' USING PigStorage('\t') AS (id:chararray,category:chararray);

grp= GROUP input BY category;

grp_count= FOREACH grp generate group, COUNT(input);

grp_ordered= order grp_count by $1 DESC;

top_grp= LIMIT grp_ordered 5; 

2 个答案:

答案 0 :(得分:1)

非常简单 - 看看grp_count关系:

input= LOAD '/path/to/input/data' USING PigStorage('\t') AS (id:chararray,category:chararray);

grp_count= FOREACH (GROUP input BY category) 
           generate flatten(group) as category
           ,COUNT(input) as cnt;

grp_ordered= order grp_count by $1 DESC;

top_grp= LIMIT grp_ordered 5; 

答案 1 :(得分:0)

如果我理解你的问题,下面是代码。

data = LOAD 'data' USING PigStorage() AS (id,category);
grp = GROUP data BY category;

grp_count = FOREACH grp {

                  ord = order data by $1 DESC ;
                  top_grp = LIMIT ord 5;
           GENERATE flatten(group),COUNT(top_grp.$1) ; };

dump grp_count;