Collapsing rows into a single row of column values in Pig

Date: 2015-10-07 18:22:10

Tags: apache-pig

I have data like this:

1, 0, 0
0, 1, 0
0, 0, 1

The required output is:

1, 1, 1

How can I do this in Pig?

1 answer:

Answer 0 (score: 0)

Input:

1, 0, 0
0, 1, 0
0, 0, 1

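Add a new field that holds the same constant value in every row, group by that field so all rows land in one group, and then take the MAX of each column. Because each column contains only 0s and 1s, the maximum is 1 whenever any row has a 1 in that column.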

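-- Load the comma-separated file as three integer columns.
records = LOAD '/user/cloudera/records.txt' USING PigStorage(',') AS (c1:int, c2:int, c3:int);

-- Add a constant key so that every row falls into the same group.
records_each = FOREACH records GENERATE 'KEY' AS grouping_key, c1, c2, c3;

records_grp = GROUP records_each BY grouping_key;

-- Take the maximum of each column across the single group.
records_grp_each = FOREACH records_grp GENERATE MAX(records_each.c1) AS c1, MAX(records_each.c2) AS c2, MAX(records_each.c3) AS c3;

DUMP records_grp_each;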

Output:

 (1,1,1)
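
As a possible simplification of the constant-key approach above, Pig's GROUP ... ALL places every record into a single group, so the extra grouping_key field should not be needed. The following is a minimal sketch, assuming the same input path and schema as in the answer; the relation names records_all and records_max are illustrative and not part of the original answer.

```pig
-- Load the comma-separated file as three integer columns
-- (same path and schema as in the answer above).
records = LOAD '/user/cloudera/records.txt' USING PigStorage(',') AS (c1:int, c2:int, c3:int);

-- GROUP ... ALL collects every record into one group,
-- which replaces the hand-made constant key.
-- (records_all is an illustrative name.)
records_all = GROUP records ALL;

-- Take the maximum of each column across that single group.
-- (records_max is an illustrative name.)
records_max = FOREACH records_all GENERATE MAX(records.c1) AS c1, MAX(records.c2) AS c2, MAX(records.c3) AS c3;

DUMP records_max;
```

Either way, all rows pass through a single group, so for large inputs the final aggregation is handled by one reducer.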