How to load a delimited file with empty fields in Pig

Time: 2013-06-30 19:07:12

Tags: apache-pig

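I load a file with the command below, and when I try to DUMP or ILLUSTRATE the loaded data, it fails with the error shown further down. I have checked the integrity of the data: every line contains the correct number of delimiters, but when a field is empty its delimiter is immediately followed by the next one. I also tried loading just the single sample line below, and that does not work either.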

hs_2_inr = LOAD 'hs_2_inr.dat' USING PigStorage('^') as ( year:chararray, country:chararray, s_no:chararray, hs_8:chararray, hs_8_desc:chararray, prevyr_inr:chararray, curyr_inr:chararray, growth:chararray, dummy:chararray);

Here is the sample data:

1997^BOTSWANA^1.^10063001^*RICE PARBOILED^^2.43^^

Here is the exception:

2013-06-30 21:02:23,015 [main] ERROR org.apache.pig.pen.AugmentBaseDataVisitor - No (valid) input data found!
java.lang.RuntimeException: No (valid) input data found!
    at org.apache.pig.pen.AugmentBaseDataVisitor.visit(AugmentBaseDataVisitor.java:583)
    at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:229)
    at org.apache.pig.pen.util.PreOrderDepthFirstWalker.depthFirst(PreOrderDepthFirstWalker.java:82)
    at org.apache.pig.pen.util.PreOrderDepthFirstWalker.depthFirst(PreOrderDepthFirstWalker.java:84)
    at org.apache.pig.pen.util.PreOrderDepthFirstWalker.walk(PreOrderDepthFirstWalker.java:66)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:180)
    at org.apache.pig.PigServer.getExamples(PigServer.java:1180)
    at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:739)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:538)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
2013-06-30 21:02:23,016 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception 

So how do I load a file with empty fields in Pig?

1 answer:

Answer 0: (score: 3)

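Your code works fine. As you mentioned in the comments, ILLUSTRATE is the problem. According to the Pig documentation, ILLUSTRATE was not being maintained at the time, so don't rely on it. You shouldn't be using it in any non-diagnostic code anyway. Use DESCRIBE instead.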

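In the docs, the warning on ILLUSTRATE seems to have since disappeared, so it may be safe to use again, but I would still rely on DESCRIBE to avoid potential problems. In Pig 0.10, which is what I am using, ILLUSTRATE still gives me the same error.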
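
As a rough sketch of what that looks like in practice, the script below reuses the LOAD statement from the question and inspects the relation without ILLUSTRATE. DESCRIBE prints only the schema, while DUMP prints the rows. With PigStorage, an empty field between two delimiters should be loaded as a null. The `missing_prev` relation is a hypothetical addition for illustration and is not part of the original question.

```pig
-- Load the '^'-delimited file; the schema is copied from the question.
hs_2_inr = LOAD 'hs_2_inr.dat' USING PigStorage('^') AS (
    year:chararray, country:chararray, s_no:chararray,
    hs_8:chararray, hs_8_desc:chararray, prevyr_inr:chararray,
    curyr_inr:chararray, growth:chararray, dummy:chararray);

-- Print the schema only.
DESCRIBE hs_2_inr;

-- Print the rows themselves; empty fields are expected to appear as nulls.
DUMP hs_2_inr;

-- Hypothetical follow-up: keep only rows whose previous-year value is missing.
missing_prev = FILTER hs_2_inr BY prevyr_inr IS NULL;
DUMP missing_prev;
```

If the sample line from the question loads as expected, it should pass the `IS NULL` filter, because its `prevyr_inr` field is empty.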