I am new to Oozie and am trying to run a Pig script through an Oozie workflow. Below is the Pig script, named first.pig:
A = LOAD '/user/jas/pigip' USING PigStorage(',');
B = FOREACH A GENERATE $0;
STORE B INTO '/user/jas/pigop';
Here is workflow.xml:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="PIGACTION">
<start to="pig-node"/>
<action name="pig-node">
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<script>first.pig</script>
</pig>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Pig Script failed!!!</message>
</kill>
<end name="end"/>
</workflow-app>
Here is job.properties:
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
oozie.libpath=/user/oozie/shared/lib
oozie.wf.application.path=${nameNode}/user/my/pigact
To run the workflow:
1) I placed first.pig and workflow.xml in /user/my/pigact
2) The input file (a simple CSV) was uploaded to /user/jas/pigip
To run the job, I used the following command:
oozie job -oozie http://localhost:11000/oozie -config job.properties -auth SIMPLE -run
The job was submitted and then killed. Here is the job log:
2016-03-02 11:35:01,463 INFO ActionStartXCommand:539 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@:start:] Start action [0000009-160301223816814-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-02 11:35:01,463 WARN ActionStartXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@:start:] [***0000009-160301223816814-oozie-oozi-W@:start:***]Action status=DONE
2016-03-02 11:35:01,464 WARN ActionStartXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@:start:] [***0000009-160301223816814-oozie-oozi-W@:start:***]Action updated in DB!
2016-03-02 11:35:01,524 INFO ActionStartXCommand:539 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] Start action [0000009-160301223816814-oozie-oozi-W@pig-node] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-02 11:35:02,411 WARN PigActionExecutor:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] credentials is null for the action
2016-03-02 11:35:04,170 INFO PigActionExecutor:539 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] checking action, external ID [job_201603012236_0015] status [RUNNING]
2016-03-02 11:35:04,293 WARN ActionStartXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] [***0000009-160301223816814-oozie-oozi-W@pig-node***]Action status=RUNNING
2016-03-02 11:35:04,295 WARN ActionStartXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] [***0000009-160301223816814-oozie-oozi-W@pig-node***]Action updated in DB!
2016-03-02 11:35:12,210 INFO CallbackServlet:539 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] callback for action [0000009-160301223816814-oozie-oozi-W@pig-node]
2016-03-02 11:35:12,409 INFO PigActionExecutor:539 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] action completed, external ID [job_201603012236_0015]
2016-03-02 11:35:12,434 WARN PigActionExecutor:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
2016-03-02 11:35:12,705 INFO ActionEndXCommand:539 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] ERROR is considered as FAILED for SLA
2016-03-02 11:35:12,794 INFO ActionStartXCommand:539 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@fail] Start action [0000009-160301223816814-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-02 11:35:12,795 WARN ActionStartXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@fail] [***0000009-160301223816814-oozie-oozi-W@fail***]Action status=DONE
2016-03-02 11:35:12,795 WARN ActionStartXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@fail] [***0000009-160301223816814-oozie-oozi-W@fail***]Action updated in DB!
2016-03-02 11:35:12,904 WARN CoordActionUpdateXCommand:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
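Please explain what went wrong; I cannot pinpoint the problem.
Edit
Screenshots:
The console URL highlighted in the third image leads to the page shown in the fourth image. Please advise.
Here is the log I got: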
>>> Invoking Pig command line now >>>
Run pig script using PigRunner.run() for Pig version 0.8+
Apache Pig version 0.10.0-cdh4.1.1 (rexported)
compiled Oct 16 2012, 10:15:57
Run pig script using PigRunner.run() for Pig version 0.8+
1196 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-cdh4.1.1 (rexported) compiled Oct 16 2012, 10:15:57
1199 [main] INFO org.apache.pig.Main - Logging error messages to: /var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/training/jobcache/job_201603012236_0030/attempt_201603012236_0030_m_000000_0/work/pig-job_201603012236_0030.log
1463 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
1472 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8021
1674 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. name
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher ends
stderr logs
Details at logfile: /var/lib/hadoop-hdfs/cache/mapred/mapred/local/taskTracker/training/jobcache/job_201603012236_0030/attempt_201603012236_0030_m_000000_0/work/pig-job_201603012236_0030.log
Pig logfile dump:
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. name
java.lang.NoSuchFieldError: name
at org.apache.pig.parser.QueryParserStringStream.<init>(QueryParserStringStream.java:32)
at org.apache.pig.parser.QueryParserDriver.tokenize(QueryParserDriver.java:198)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:166)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1594)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1545)
at org.apache.pig.PigServer.registerQuery(PigServer.java:545)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:430)
at org.apache.pig.PigRunner.run(PigRunner.java:49)
at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:282)
at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:222)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:472)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
================================================================================
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
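Answer 0 (score: 2)
Look at QueryParserStringStream.java: it extends ANTLRStringStream. If ANTLR were missing altogether, you would get a ClassNotFoundException or NoClassDefFoundError, not a NoSuchFieldError. The only likely cause I can think of is that a different version of ANTLR is present on the runtime classpath, which would produce this error.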
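One way to test this hypothesis is to list the ANTLR jars that the launcher could pick up. The sketch below defines a hypothetical helper, `list_antlr_jars` (it is not part of Hadoop, Pig, or Oozie), that reads file paths on standard input and prints the distinct ANTLR jar names. It assumes the jars follow the usual `antlr*.jar` naming and that the `hadoop` CLI is available for the usage shown in the comment.

```shell
# Hypothetical helper: read file paths (one per line) on stdin and print
# the distinct ANTLR jar file names, with directories stripped.
# Assumption: ANTLR jars are named antlr*.jar (case-insensitive match).
list_antlr_jars() {
  grep -i 'antlr[^/]*\.jar$' | sed 's#.*/##' | sort -u
}

# Possible use against the libpath from job.properties; the last field of
# each "hadoop fs -ls" line is the path:
#   hadoop fs -ls -R /user/oozie/shared/lib | awk '{print $NF}' | list_antlr_jars
```

If this prints more than one ANTLR version for the libpath, or for a lib directory under the workflow application path, that would be consistent with a version conflict. Which version Pig 0.10.0-cdh4.1.1 expects is not stated in the question and would need to be checked separately.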
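Answer 1 (score: 1)
The error you are seeing is the failure of the Oozie launcher mapper job, which takes your Pig script, gathers its resources, and actually runs it on your cluster. It essentially mimics what grunt does on the command line.
If you have the Oozie web console available, you can click the killed action in the workflow job, which shows the job URL. That URL actually points to your Oozie launcher mapper job, whose logs you can inspect to see what went wrong.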
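If the web console is not available, similar information can usually be pulled from the Oozie command line. The following is a minimal sketch, assuming the installed `oozie` client accepts an action ID of the form `<job-id>@<action-name>` for `-info`; the function names `show_action` and `show_job_log` are hypothetical wrappers, and the default server URL is the one used in the question.

```shell
# Hypothetical wrappers around the Oozie CLI.
# The server URL defaults to the one from the question unless OOZIE_URL is set.

# Print details for one workflow action (status, external job ID, console URL).
# $1 = workflow job ID, $2 = action name
show_action() {
  oozie job -oozie "${OOZIE_URL:-http://localhost:11000/oozie}" -info "$1@$2"
}

# Print the Oozie server log for a workflow job.
# $1 = workflow job ID
show_job_log() {
  oozie job -oozie "${OOZIE_URL:-http://localhost:11000/oozie}" -log "$1"
}

# Possible use with the job from the question:
#   show_action 0000009-160301223816814-oozie-oozi-W pig-node
#   show_job_log 0000009-160301223816814-oozie-oozi-W
```

The action details should include the external Hadoop job ID of the launcher, whose task logs on the JobTracker hold the Pig stack trace rather than the generic "exit code [2]" message.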
Answer 2 (score: 0)
An error occurred while executing the Pig script. Here is the error from the log:
2016-03-02 11:35:12,434 WARN PigActionExecutor:542 - USER[training] GROUP[-] TOKEN[] APP[PIGACTION] JOB[0000009-160301223816814-oozie-oozi-W] ACTION[0000009-160301223816814-oozie-oozi-W@pig-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
I don't see any obvious problem in the script. Can you run the script in MapReduce mode and check whether it reports any errors?
pig -x mapreduce first.pig
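To make the comparison with the Oozie run explicit, the command above can be wrapped so that it also reports Pig's exit status. This is only a sketch; `run_pig` is a hypothetical helper name, and it assumes the `pig` client is installed on the machine where it is run.

```shell
# Hypothetical helper: run a Pig script in the given execution mode and
# report the exit status, so it can be compared with the launcher's exit code.
# $1 = execution mode (local or mapreduce), $2 = path to the Pig script
run_pig() {
  pig -x "$1" "$2"
  status=$?
  echo "pig -x $1 $2 exited with status $status"
  return "$status"
}

# Possible use:
#   run_pig mapreduce first.pig
```

If the script succeeds when run this way but still fails under Oozie, the script itself is probably fine and the difference lies in the environment the Oozie launcher provides, such as the jars on its classpath.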