Pig Java UDF: generating a bag from a tuple

Time: 2018-03-18 15:33:25

Tags: apache-pig user-defined-functions

I am hoping someone can help me create a Java UDF that takes this input, spread across three text files:

answer

and returns the following output bag:

Montreal, 5 3 10 9 8
Toronto, 7 2 2 3 4 4
Edmonton, 3 3 1 1 7
Montreal, 2 2 9

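I am very new to Java and would greatly appreciate any help you can provide. Thanks.

1 answer:

Answer 0: (score: 0)

If you are using Pig 0.14 or later, which supports STRSPLITTOBAG, then: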

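-- Load each line as a place name and a string of numbers.
A = load 'test.input' using PigStorage(',') as (place:chararray, numbers:chararray);
-- Split the numbers string into a bag and flatten it to one row per number.
B = FOREACH A GENERATE place, FLATTEN(STRSPLITTOBAG(numbers)) as number;
C = FOREACH B GENERATE place, (chararray) number;
-- Group the rows by place, then keep only the grouped bag.
D = GROUP C by place;
E = FOREACH D generate C; -- dropping group field
dump E;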

Output

({(Toronto,2),(Toronto,2),(Toronto,7),(Toronto,4),(Toronto,4),(Toronto,3)})
({(Edmonton,7),(Edmonton,1),(Edmonton,1),(Edmonton,3),(Edmonton,3)})
({(Montreal,9),(Montreal,2),(Montreal,2),(Montreal,8),(Montreal,9),(Montreal,10),(Montreal,3),(Montreal,5)})
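
Since the question asks for Java, the split-and-group logic of the script above can also be sketched in plain Java. This is only an illustration under stated assumptions, not a Pig UDF: it uses no Pig classes, the class name PlaceNumberGrouper and the method group are hypothetical, and it assumes each input line has the form "place, n1 n2 n3". Unlike the Pig output, it keeps places and numbers in input order and trims whitespace around each field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of Pig or of the original answer.
class PlaceNumberGrouper {

    // Groups the whitespace-separated numbers of each line under the
    // place name that precedes the first comma.
    static Map<String, List<String>> group(List<String> lines) {
        // LinkedHashMap keeps places in the order they first appear.
        Map<String, List<String>> byPlace = new LinkedHashMap<>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // assumption: skip lines without a comma separator
            }
            String place = line.substring(0, comma).trim();
            String numbers = line.substring(comma + 1).trim();
            List<String> bucket = byPlace.computeIfAbsent(place, k -> new ArrayList<>());
            if (numbers.isEmpty()) {
                continue; // a place with no numbers still gets an empty list
            }
            // Split on runs of whitespace, one entry per number.
            bucket.addAll(Arrays.asList(numbers.split("\\s+")));
        }
        return byPlace;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Montreal, 5 3 10 9 8",
                "Toronto, 7 2 2 3 4 4",
                "Edmonton, 3 3 1 1 7",
                "Montreal, 2 2 9");
        // prints {Montreal=[5, 3, 10, 9, 8, 2, 2, 9], Toronto=[7, 2, 2, 3, 4, 4], Edmonton=[3, 3, 1, 1, 7]}
        System.out.println(group(lines));
    }
}
```

Turning this into a real UDF would mean wrapping the same loop in a class that extends Pig's EvalFunc and building Pig tuples and bags instead of Java lists; that step is left out here because it depends on the Pig jar.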