Consider this script:
register udf-1.0.0-BETA.jar;
A = LOAD '1.txt' USING PigStorage('\t') as (key1:chararray, val1:chararray);
B = LOAD '2.txt' USING PigStorage('\t') as (key2:chararray, val2:chararray);
joined = JOIN A by key1, B by key2;
out = FOREACH joined GENERATE com.example.UDF();
dump out;
As written, my UDF receives only the keys. If I try this instead:
out = FOREACH joined GENERATE com.example.UDF(joined);
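I get the exception "A column needs to be projected from a relation for it to be used as a scalar".
I can pass the whole relation by listing every field, like this: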
out = FOREACH joined GENERATE com.example.UDF(A::key1, A::val1, B::key2, B::val2);
But this is verbose. Is there a simpler way?
Answer 0 (score: 3)
Yes, try the following:
out = FOREACH joined GENERATE com.example.UDF(*);
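Inside FOREACH, the star expression stands for all fields of the current row, so the UDF receives one input tuple holding the fields of A followed by the fields of B. The sketch below assumes the joined relation and the com.example.UDF function from the question. The out_positional alias is a hypothetical name, added only to compare the star form with positional references.

```pig
-- Print the schema of the joined relation; these are the fields that * expands to.
DESCRIBE joined;

-- Pass every field of each joined row to the UDF as a single input tuple.
out = FOREACH joined GENERATE com.example.UDF(*);

-- Equivalent call using positional references (out_positional is a hypothetical alias).
-- $0..$3 map to A::key1, A::val1, B::key2, B::val2 in that order.
out_positional = FOREACH joined GENERATE com.example.UDF($0, $1, $2, $3);
```

The star form keeps working if fields are later added to A or B, while the positional and named forms have to be updated by hand.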