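My dataset has the following structure: {movie:chararray, year:int, weight:float, actor:chararray}
I am trying to find the movie with the highest weight for each year. So I grouped by year and movie, and got the following relation:
{group: (year:int, movie:chararray), movies:{(movie:chararray, year:int, weight:float, actor:chararray)}}
My question is: how do I sort the bag by a value inside its tuples, namely weight? Thanks.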
Answer 0 (score: 3)
You can use a nested block inside FOREACH.
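inpt = load '...../data.csv' using PigStorage(',') as (movie:chararray, year:int, weight:float, actor:chararray);
grp = group inpt by (year, movie);
-- Inside the nested block, inpt refers to the bag of tuples for the current group.
srt = foreach grp {
    -- Ascending by default; use "ORDER inpt BY weight DESC" to put the highest weight first.
    by_weight = ORDER inpt BY weight;
    generate group, by_weight;
};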
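
If the goal is the single highest-weight movie per year, as the question describes, one possible variation is to group by year only and combine a nested ORDER with LIMIT. This is a minimal sketch rather than part of the original answer: it assumes the same schema and the same elided input path as above, and the alias names by_year, sorted, top1 and top_per_year are made up for illustration.

```pig
-- Sketch only: same schema and (elided) path as in the question.
inpt = LOAD '...../data.csv' USING PigStorage(',') AS (movie:chararray, year:int, weight:float, actor:chararray);

-- Group by year alone so each bag holds every movie row for that year.
by_year = GROUP inpt BY year;

top_per_year = FOREACH by_year {
    -- Highest weight first within the current year's bag.
    sorted = ORDER inpt BY weight DESC;
    -- Keep only the first tuple of the sorted bag.
    top1 = LIMIT sorted 1;
    -- Flatten the one-tuple bag back into a plain row.
    GENERATE FLATTEN(top1);
};

DUMP top_per_year;
```

Note that LIMIT keeps only one tuple, so if several rows in a year share the top weight, only one of them is returned.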