尝试转储表时出现Pig错误

时间:2014-06-12 19:02:51

标签: hadoop apache-pig

我试图运行一个非常简单的猪脚本并继续遇到并发症。

SCRIPT:

log = LOAD 'C:/Users/malanio/Documents/test.log' USING PigStorage(',') AS (user:chararray, some:long, some2:chararray);
DUMP log;

我正在加载的文件:

ravi,1,1

发生以下错误:

C:\Users\malanio\Documents>pig -x local testrun.pig
2014-06-12 14:46:22,939 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.1 (r1585011) compiled Apr 05 2014, 01:41:34
2014-06-12 14:46:22,940 [main] INFO  org.apache.pig.Main - Logging error messages to: C:\hadoop-2.4.0\logs\pig_1402598782937.log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/hadoop-2.4.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/pig-0.12.1/pig-0.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2014-06-12 14:46:23,616 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file C:\Users\malanio/.pigbootup not found
2014-06-12 14:46:23,702 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-06-12 14:46:23,702 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-06-12 14:46:23,704 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-06-12 14:46:24,275 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-06-12 14:46:24,317 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NewPartitionFilterOptimizer, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2014-06-12 14:46:24,470 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-06-12 14:46:24,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-06-12 14:46:24,501 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-06-12 14:46:24,526 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
2014-06-12 14:46:24,527 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2014-06-12 14:46:24,551 [main] WARN  org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
    at java.lang.Class.getDeclaredField(Class.java:1948)
    at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
    at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:98)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:289)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:191)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1324)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1309)
    at org.apache.pig.PigServer.storeEx(PigServer.java:980)
    at org.apache.pig.PigServer.store(PigServer.java:944)
    at org.apache.pig.PigServer.openIterator(PigServer.java:857)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:607)
    at org.apache.pig.Main.main(Main.java:156)
2014-06-12 14:46:24,569 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-06-12 14:46:24,579 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2014-06-12 14:46:24,581 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-06-12 14:46:24,584 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2014-06-12 14:46:24,625 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-06-12 14:46:24,640 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2014-06-12 14:46:24,642 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache
2014-06-12 14:46:24,645 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: C:\Users\malanio\AppData\Local\Temp\1402598784640-0
2014-06-12 14:46:24,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2014-06-12 14:46:24,693 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2014-06-12 14:46:24,704 [JobControl] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2014-06-12 14:46:24,714 [JobControl] ERROR org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl - Error while trying to run jobs.
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:186)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
    at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
    at java.lang.Thread.run(Thread.java:745)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:271)
2014-06-12 14:46:24,753 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2014-06-12 14:46:24,764 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-06-12 14:46:24,767 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2014-06-12 14:46:24,771 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-06-12 14:46:24,783 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: Unexpected System Error Occured: java.lang.IncompatibleClassChang
eError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:186)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
    at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
    at java.lang.Thread.run(Thread.java:745)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:271)

2014-06-12 14:46:24,821 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-06-12 14:46:24,824 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2014-06-12 14:46:24,831 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
2.4.0   0.12.1  malanio 2014-06-12 14:46:24     2014-06-12 14:46:24     UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
N/A     log     MAP_ONLY        Message: Unexpected System Error Occured: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:186)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
    at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
    at java.lang.Thread.run(Thread.java:745)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:271)
    file:/tmp/temp-590289635/tmp-804647280,

Input(s):
Failed to read data from "C:/Users/malanio/Documents/test.log"

Output(s):
Failed to produce result in "file:/tmp/temp-590289635/tmp-804647280"

Job DAG:
null


2014-06-12 14:46:24,939 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2014-06-12 14:46:24,952 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias log
Details at logfile: C:\hadoop-2.4.0\logs\pig_1402598782937.log

当我注释掉DUMP行时,没有问题。只有在脚本尝试将其运行的数据转储为并发症时才会这样。我在最新的hadoop(2.4.0)和最新的猪(0.12.1)上本地运行脚本。我还是猪的新手,这可能只是一个简单的解释,但我似乎无法破译错误代码。我认为这可能与猪罐及其API有关。有什么建议吗?

1 个答案:

答案 0 :(得分:1)

  

当我注释掉DUMP行时,没有问题。只有当脚本试图将其运行的数据转储为并发症时才会这样做。

除非在最终结果/输出中需要,否则不会执行pig脚本中的语句。这就是“DUMP”声明没有问题的原因。这类似于懒惰评估,但存在差异。