Answer 0 (score: 0)
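Here is a naive solution. Use RANK to number the rows, GROUP to bucket every three consecutive rows under one id, and FILTER to pick out the row matching each of the three conditions within a bucket. It assumes each record occupies exactly three consecutive lines of the input.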
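-- Load each input line as a single chararray field.
A = LOAD '/path_to_data/data' AS (c1:chararray);
-- RANK prepends a 1-based row number, exposed as rank_A.
B = RANK A;
-- Integer division maps ranks 1-3 to id 1, ranks 4-6 to id 2, and so on.
C = FOREACH B GENERATE (rank_A + 2) / 3 AS id, c1;
-- Within each group of three rows, pick out the row for each prefix.
D = FOREACH (GROUP C BY id) {
    ONE = FILTER C BY c1 matches 'One:.*';
    TWO = FILTER C BY c1 matches 'Two:.*';
    THREE = FILTER C BY c1 matches 'Three:.*';
    GENERATE
        group AS id,
        FLATTEN(ONE.c1) AS c1_one,
        FLATTEN(TWO.c1) AS c1_two,
        FLATTEN(THREE.c1) AS c1_three;
};
DUMP D;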
Output:
(1,One:"A",Two:"2",Three:"last")
(2,One:"B",Two:"1",Three:"first")
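
To see why (rank_A + 2) / 3 puts every three rows under the same id, the following plain-Python sketch mimics the same RANK, GROUP, and FILTER steps. It is only an illustration of the arithmetic, not Pig code. The function name pivot_every_three is hypothetical, and the sample input is reconstructed from the output shown above, so treat both as assumptions.

```python
from collections import defaultdict
from itertools import product


def pivot_every_three(lines):
    """Simulate RANK -> (rank + 2) / 3 -> GROUP -> FILTER -> FLATTEN."""
    groups = defaultdict(list)
    # RANK numbers rows starting at 1; integer division buckets them by threes.
    for rank, line in enumerate(lines, start=1):
        groups[(rank + 2) // 3].append(line)

    rows = []
    for gid in sorted(groups):
        members = groups[gid]
        # One FILTER per prefix, as in the nested FOREACH block.
        one = [m for m in members if m.startswith("One:")]
        two = [m for m in members if m.startswith("Two:")]
        three = [m for m in members if m.startswith("Three:")]
        # Flattening several bags yields their cross product, so an
        # empty bag means the group produces no output row at all.
        for a, b, c in product(one, two, three):
            rows.append((gid, a, b, c))
    return rows


# Assumed input, inferred from the output shown in the answer.
sample = [
    'One:"A"', 'Two:"2"', 'Three:"last"',
    'One:"B"', 'Two:"1"', 'Three:"first"',
]
result = pivot_every_three(sample)
for row in result:
    print(row)
```

Note the consequence of the cross product: if a group is missing one of the three lines, or the lines are not contiguous in the input, that group is silently dropped, which is part of what makes the solution naive.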