I have written a custom UDF in Pig, named ValidateUser, to validate user names.
public class ValidateUser extends FilterFunc {
public Boolean exec(Tuple tuple) throws IOException {
// custom validation code
}
}
The class is in the default package and is packaged in pig_udfs.jar.
This JAR is used in the Pig script validateUsers.pig:
REGISTER 'pig_udfs.jar';
users = load 'users.txt' using PigStorage(',') as (user:chararray);
validUsers = filter users by ValidateUser(user);
dump validUsers;
I tried to execute the script with:
pig -x local validateusers.pig
I get the error shown below. Any comments or suggestions on resolving it would be appreciated!
Pig Stack Trace:
ERROR 1003: Unable to find an operator for alias fileterd
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1003: Unable to find an operator for alias fileterd
at org.apache.pig.PigServer.openIterator(PigServer.java:732)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:615)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:500)
at org.apache.pig.Main.main(Main.java:107)
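Answer 0 (score: 0)
I did not run into any problem with a custom filter UDF; it works fine for me. Can you try the following?
In the example below, the filter keeps only the names that are not equal to "test".
users.txt: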
test
mike
test
john
PigScript:
REGISTER 'pig_udfs.jar';
users = load 'users.txt' using PigStorage(',') as (user:chararray);
validUsers = filter users by ValidateUser(user);
dump validUsers;
ValidateUser.java:
import java.io.IOException;
import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

public class ValidateUser extends FilterFunc {
    @Override
    public Boolean exec(Tuple input) throws IOException {
        // Reject empty tuples and null fields instead of failing with a NullPointerException.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return false;
        }
        String str = (String) input.get(0);
        // Keep every name except "test".
        return !"test".equals(str);
    }
}
Output:
(john)
(mike)
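If you want to check the filter rule itself before involving Pig at all, a plain-Java sketch like the one below may help. It is only an illustration: ValidateUserCheck, isValid and keepValid are hypothetical names that are not part of Pig or of the UDF above, and rejecting null is an assumption on my part. It applies the "not equal to test" rule to the same four names as users.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for local checking only; not part of Pig or pig_udfs.jar.
class ValidateUserCheck {

    // Rule from the example: keep every name except "test".
    // Rejecting null is an assumption made for this sketch.
    static boolean isValid(String user) {
        return user != null && !user.equals("test");
    }

    // Mimics what FILTER does: keep the rows for which the predicate is true.
    static List<String> keepValid(List<String> users) {
        List<String> kept = new ArrayList<>();
        for (String user : users) {
            if (isValid(user)) {
                kept.add(user);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same names as in users.txt above.
        List<String> users = Arrays.asList("test", "mike", "test", "john");
        for (String user : keepValid(users)) {
            System.out.println("(" + user + ")");
        }
    }
}
```

If this keeps mike and john but the Pig script still fails, the problem is more likely in the script or in how the JAR is registered than in the validation logic.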
Make sure piggybank.jar is set in your classpath, then compile, package and run:
> javac ValidateUser.java
> jar -cvf pig_udfs.jar ValidateUser.class
> pig -x local validateusers.pig