For some reason, adding a filter to the statement below causes an error. In the console output I see: Failed to read data from "...". In the log I found this:
Backend error message
---------------------
java.lang.NullPointerException
at org.apache.pig.builtin.Utf8StorageConverter.consumeTuple(Utf8StorageConverter.java:185)
at org.apache.pig.builtin.Utf8StorageConverter.consumeBag(Utf8StorageConverter.java:94)
at org.apache.pig.builtin.Utf8StorageConverter.bytesToBag(Utf8StorageConverter.java:331)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:1562)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:228)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:282)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:416)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:3
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias limited
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias limited
at org.apache.pig.PigServer.openIterator(PigServer.java:838)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:604)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: Couldn't retrieve job.
at org.apache.pig.PigServer.store(PigServer.java:902)
at org.apache.pig.PigServer.openIterator(PigServer.java:813)
... 12 more
The code I am using is as follows:
--- Read the input
records = LOAD 'data' AS (id1, id2, link, tags:bag{}, dates);
counted = FOREACH records GENERATE (chararray) id1, (int) COUNT(tags) as amountOfTags;
filtered = FILTER counted BY amountOfTags > 0;
limited = limit filtered 10;
--- Save the result
dump limited;
Everything works fine until I add the filtered... line and try to dump the result.
Can anyone tell me why?