Oozie and Job History Server configuration issue

Time: 2016-01-12 08:05:01

Tags: hadoop yarn hadoop2 cloudera-cdh hue

Question

I am trying to install a pseudo-distributed CDH without using CDM. Everything "works" from the console. However, the second I start using Hue, I get an error when trying to use Pig.

The error displayed in Hue is:

JA017: Could not lookup launched hadoop Job ID [job_local2125047777_0001] which was associated with action [0000000-160112011607704-oozie-oozi-W@pig]. Failing this action!

I believe this error message is misleading, and that the underlying problem is in the Oozie workflow that connects Pig to the Job History Server.
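One detail in the error may be worth checking. Hadoop's local job runner assigns IDs of the form job_local<number>_<sequence>, whereas jobs submitted to YARN get IDs of the form job_<cluster timestamp>_<sequence>. A minimal sketch for telling the two apart (the function name is made up for illustration):

```shell
# Hypothetical helper: succeed if the job ID starts with "job_local" followed
# by a digit, which is the shape of IDs issued by the local job runner.
is_local_job_id() {
  case "$1" in
    job_local[0-9]*_[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Job ID copied from the error message shown in Hue.
if is_local_job_id "job_local2125047777_0001"; then
  echo "job ran with the local job runner"
fi
```

If the ID is a local one, the mapreduce.framework.name value seen by the Oozie launcher may be worth a look, though that is a guess rather than something confirmed in this thread.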

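Before this, I was unable to use Hive from Hue because Oozie had trouble installing its sharelib on HDFS. I solved that by creating a symbolic link between /etc/hadoop/conf/core-site.xml and /etc/oozie/conf/hadoop-conf/core-site.xml, as suggested here: Apache Oozie failed loading ShareLib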
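For reference, the symlink step might look like the sketch below. The link direction (Oozie's hadoop-conf entry pointing at the main Hadoop file) is an assumption, as is the helper name; the text above only says that a link was created between the two paths.

```shell
# Hypothetical helper: make "$2" a symbolic link to "$1".
# -s creates a symbolic link; -f replaces an existing destination file.
link_conf() {
  ln -sf "$1" "$2"
}

# Assumed usage for the paths in the question (needs root):
#   sudo ln -sf /etc/hadoop/conf/core-site.xml /etc/oozie/conf/hadoop-conf/core-site.xml
```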

Script information

The configuration script I wrote to install CDH on Scientific Linux 7 can be found here: https://github.com/coatless/stat490uiuc/blob/master/install_scripts/cdh_build.sh

Specifically, I am trying to get results from this Pig script:

data = LOAD '/user/hue/pig/examples/data/midsummer.txt' as (text:CHARARRAY);

upper_case = FOREACH data GENERATE org.apache.pig.piggybank.evaluation.string.UPPER(text);

STORE upper_case INTO '$output' ;

Attempted solutions

From googling, I came across the following solutions. None of them resolved the problem once implemented.

One suggestion was to run the following commands:

sudo -u hdfs hadoop fs -mkdir -p /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown mapred:hadoop /user/history

Restarting the ResourceManager, the NodeManager, HDFS, and the history server afterwards made no difference.

In that thread, another user suggested setting a property in job.properties that specifies user.name=mapred. However, I cannot find any reference to a job.properties file for Hue jobs.

Another post suggested declaring fixed paths for the history server in the mapred-site.xml file:

<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/user/history/done</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>/user/history/done_intermediate</value>
</property>

That did not work either.
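When editing mapred-site.xml by hand, it can help to confirm what a particular copy of the file actually declares, since a setup like this one may have more than one conf directory. A rough sketch, assuming one tag per line as in the snippet above; get_hadoop_prop is a made-up helper, not a Hadoop command:

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. Assumes one tag per line; not a general XML parser.
get_hadoop_prop() {
  file=$1
  name=$2
  awk -v want="$name" '
    /<name>/ {
      line = $0
      sub(/.*<name>/, "", line); sub(/<\/name>.*/, "", line)
      found = (line == want)      # remember whether this is the property we want
    }
    found && /<value>/ {
      line = $0
      sub(/.*<value>/, "", line); sub(/<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$file"
}

# Assumed usage (the file location is a guess based on the conf path in the question):
#   get_hadoop_prop /etc/hadoop/conf/mapred-site.xml mapreduce.jobhistory.done-dir
```

This only shows what is written in the file; it does not prove that the running history server has picked the value up.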

Another post indicated that the problem might be related to permissions. However, the user gave no details on how it was resolved.

Any help would be appreciated.

Full Oozie log

The full error text from the oozie.log file:

2016-01-11 23:51:59,195  WARN ParameterVerifier:523 - SERVER[server-name] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-01-11 23:51:59,275  WARN LiteWorkflowAppService:523 - SERVER[server-name] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://localhost:8020/user/hue/oozie/workspaces/_cloudera_-oozie-1-1452577913.73/lib] does not exist
2016-01-11 23:51:59,572  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@:start:] Start action [0000000-160111235108256-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-01-11 23:51:59,595  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@:start:] [***0000000-160111235108256-oozie-oozi-W@:start:***]Action status=DONE
2016-01-11 23:51:59,596  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@:start:] [***0000000-160111235108256-oozie-oozi-W@:start:***]Action updated in DB!
2016-01-11 23:52:00,052  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Start action [0000000-160111235108256-oozie-oozi-W@pig] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-01-11 23:52:03,487  WARN Credentials:96 - SERVER[server-name] Null token ignored for oozie mr token
2016-01-11 23:52:03,506  WARN Credentials:96 - SERVER[server-name] Null token ignored for oozie mr token
2016-01-11 23:52:03,562  WARN JobResourceUploader:64 - SERVER[server-name] Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2016-01-11 23:52:03,563  WARN JobResourceUploader:171 - SERVER[server-name] No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2016-01-11 23:52:04,169  WARN MRApps:582 - SERVER[server-name] cache file (mapreduce.job.cache.files) hdfs://localhost:8020/user/oozie/share/lib/lib_20160111222734/pig/json-simple-1.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://localhost:8020/user/oozie/share/lib/lib_20160111222734/oozie/json-simple-1.1.jar This will be an error in Hadoop 2.0
2016-01-11 23:52:08,611  WARN Credentials:96 - SERVER[server-name] Null token ignored for oozie mr token
2016-01-11 23:52:08,618  WARN PigActionExecutor:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Exception in check(). Message[JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!
       at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1274)
       at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
       at org.apache.oozie.command.XCommand.call(XCommand.java:286)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
2016-01-11 23:52:08,620  WARN ActionStartXCommand:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Error starting action [pig]. ErrorType [FAILED], ErrorCode [JA017], Message [JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!
       at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1274)
       at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
       at org.apache.oozie.command.XCommand.call(XCommand.java:286)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
2016-01-11 23:52:08,621  WARN ActionStartXCommand:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Failing Job due to failed action [pig]
2016-01-11 23:52:08,623  WARN LiteWorkflowInstance:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Workflow Failed. Failing node [pig]
2016-01-11 23:52:08,768  INFO KillXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[] STARTED WorkflowKillXCommand for jobId=0000000-160111235108256-oozie-oozi-W
2016-01-11 23:52:08,806  INFO KillXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[] ENDED WorkflowKillXCommand for jobId=0000000-160111235108256-oozie-oozi-W
2016-01-11 23:52:09,038  INFO CallbackServlet:520 - SERVER[server-name] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] callback for action [0000000-160111235108256-oozie-oozi-W@pig]
2016-01-11 23:52:09,072 ERROR CompletedActionXCommand:517 - SERVER[server-name] USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] XException,
org.apache.oozie.command.CommandException: E0800: Action it is not running its in [FAILED] state, action [0000000-160111235108256-oozie-oozi-W@pig]
       at org.apache.oozie.command.wf.CompletedActionXCommand.eagerVerifyPrecondition(CompletedActionXCommand.java:92)
       at org.apache.oozie.command.XCommand.call(XCommand.java:257)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
2016-01-11 23:52:09,082  WARN CallableQueueService$CallableWrapper:523 - SERVER[server-name] USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] exception callable [callback], E0800: Action it is not running its in [FAILED] state, action [0000000-160111235108256-oozie-oozi-W@pig]
org.apache.oozie.command.CommandException: E0800: Action it is not running its in [FAILED] state, action [0000000-160111235108256-oozie-oozi-W@pig]
       at org.apache.oozie.command.wf.CompletedActionXCommand.eagerVerifyPrecondition(CompletedActionXCommand.java:92)
       at org.apache.oozie.command.XCommand.call(XCommand.java:257)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)

1 answer:

Answer 0 (score: 0)

You should double-check with the HUE file browser that the permissions are correct on /user/history and on all of its subdirectories.

In my case, all users had permissions on every subfolder of /user/history, but the HUE file browser showed that the '/user/history' directory itself had the following permissions:

Name        User     Group     Permissions
history     mapred   hadoop    drwxrwx--- 

This causes an error when running as any user other than mapred. The following command helped:

sudo -u hdfs hadoop fs -chmod 777 /user/history
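The same check can also be done from the console. A small sketch, assuming the permission string is the first column of `hadoop fs -ls -d` output; the helper name is made up:

```shell
# Hypothetical helper: succeed if the "other" class in a permission string
# such as drwxrwx--- has read, write, and execute (or sticky "t") set.
others_have_full_access() {
  perms=$1
  # Characters 8-10 of the string describe the "other" class.
  other=$(printf '%s' "$perms" | cut -c8-10)
  case "$other" in
    rwx|rwt) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed usage against the directory from this answer:
#   perms=$(hadoop fs -ls -d /user/history | awk '{print $1}')
#   others_have_full_access "$perms" || echo "/user/history is closed to other users"
```

Mode 1777, as used in the question, would show as drwxrwxrwt; the sticky bit only restricts who may delete or rename entries in the directory.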