Stack:Ambari 2.4.2.0,HDP 2.5.3.0,CentOS 6.8,FreeIPA 3.0.0
当我尝试使用 hdp 用户在纱线上提交作业时,可以成功创建并启动_000001容器,但是在创建容器后启动_000002容器时出现错误:
2018-11-27 22:13:35,919 WARN privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(170)) - Shell execution returned exit code: 255. Privileged Execution Operation Output:
main : command provided 1
main : run as user is hdp
main : requested yarn user is hdp
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
Full command array for failed execution:
[/usr/hdp/current/hadoop-yarn-nodemanager/bin/container-executor, hdp, hdp, 1, application_1543327888220_0001, container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/usercache/hdp/appcache/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/launch_container.sh, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.tokens, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid, /hadoop/yarn/local, /hadoop/yarn/log, cgroups=none]
2018-11-27 22:13:35,921 WARN runtime.DefaultLinuxContainerRuntime (DefaultLinuxContainerRuntime.java:launchContainer(107)) - Launch container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:175)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:103)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
at org.apache.hadoop.util.Shell.run(Shell.java:844)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
... 9 more
没有关于特权的日志,有人有想法吗?
谢谢!
答案 0 :(得分:0)
问题已解决,问题是提交的作业本身不是YARN /特权。
建议您最好尝试在容器日志而不是resourcemanager / nodemanager日志中查找详细信息。