Apache Spark standalone with an anonymous UID (no user name)

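Date: 2017-07-19 18:21:08

Tags: apache-spark docker openshift

I am starting Apache Spark slave (worker) nodes on the OpenShift platform. Internally, OpenShift runs Docker images as an anonymous user (the user has no name, only a UID). I get the following exception: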

17/07/17 16:46:53 INFO SignalUtils: Registered signal handler for INT
17/07/17 16:46:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.io.IOException: failure to login
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:824)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
    at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:221)
    at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:714)
    at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:696)
    at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:71)
    at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:133)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:799)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
    at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:221)
    at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:714)
    at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:696)
    at org.apache.spark.deploy.worker.Worker.main(Worker.scala)

    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:856)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:799)
    ... 10 more

I tried setting properties in spark-defaults.conf, but that still did not help.

Could you please help me resolve this issue?

Thanks,

Naveen

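2 Answers:

Answer 0: (score: 5)

Here is an alternative approach that does not require nss_wrapper.

By default, OpenShift containers run with an anonymous user ID and group ID 0 (a.k.a. the "root" group). First, set up the image so that /etc/passwd is owned by group ID 0 and has group write access, for example with this Dockerfile snippet: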

RUN chgrp root /etc/passwd && chmod ug+rw /etc/passwd

Then you can add the following logic at container startup. For example, the following script can be used as the ENTRYPOINT:

#!/bin/bash

# Look up the current UID and GID, and any existing passwd entry for the UID.
myuid=$(id -u)
mygid=$(id -g)
uidentry=$(getent passwd "$myuid")

if [ -z "$uidentry" ] ; then
    # assumes /etc/passwd has root-group (gid 0) ownership
    echo "$myuid:x:$myuid:$mygid:anonymous uid:/tmp:/bin/false" >> /etc/passwd
fi

exec "$@"

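This entrypoint script automatically provides a passwd file entry for the anonymous UID, so that tools that need one do not fail.

There is a good blog post on this and related topics about anonymous UIDs in OpenShift: https://blog.openshift.com/jupyter-on-openshift-part-6-running-as-an-assigned-user-id/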
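To try the append logic without touching the real /etc/passwd, it can be wrapped in a function that takes the file path as an argument. The following is only a sketch under assumptions: the helper name ensure_passwd_entry, the UID 1000130000, and the scratch file are illustrative and not part of the answer above. It also checks the UID field of the file directly with awk instead of calling getent, so that it can run against any file.

```shell
#!/bin/bash

# ensure_passwd_entry is a hypothetical helper name used for illustration.
# It appends an entry for UID $1 / GID $2 to the passwd-format file $3,
# unless some line in that file already has this UID in its third field.
ensure_passwd_entry() {
    local uid="$1" gid="$2" file="$3"
    if ! awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$file"; then
        echo "$uid:x:$uid:$gid:anonymous uid:/tmp:/bin/false" >> "$file"
    fi
}

# Dry run against a scratch file (assumed example content, not a real system file).
scratch=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\n' > "$scratch"

ensure_passwd_entry 1000130000 0 "$scratch"
ensure_passwd_entry 1000130000 0 "$scratch"   # second call should change nothing

cat "$scratch"
rm -f "$scratch"
```

Making the function idempotent matters if the entrypoint can run more than once against the same writable layer, since repeated appends would otherwise leave duplicate lines for the same UID.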

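Answer 1: (score: 3)

(I am leaving this answer up because it is useful to know about nss_wrapper, but the other answer works without having to install or use nss_wrapper.)

Spark wants to be able to look up its UID in passwd. This integration kink can be solved with nss_wrapper; a good example of using this solution in an image entrypoint can be found here: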

https://github.com/radanalyticsio/openshift-spark/blob/master/scripts/spark/added/entrypoint

# spark likes to be able to lookup a username for the running UID, if
# no name is present fake it.
cat /etc/passwd > /tmp/passwd
echo "$(id -u):x:$(id -u):$(id -g):dynamic uid:$SPARK_HOME:/bin/false" >> /tmp/passwd

export NSS_WRAPPER_PASSWD=/tmp/passwd
# NSS_WRAPPER_GROUP must be set for NSS_WRAPPER_PASSWD to be used
export NSS_WRAPPER_GROUP=/etc/group

export LD_PRELOAD=libnss_wrapper.so

exec "$@"
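The LD_PRELOAD line above only helps if libnss_wrapper.so is actually installed in the image; when it is missing, the dynamic loader prints a preload error and the UID lookup still fails. A minimal sketch of a guard that could run before the export, assuming a hypothetical helper name find_nss_wrapper and assuming the listed library directories (they vary by distribution):

```shell
#!/bin/bash

# find_nss_wrapper is a hypothetical helper name used for illustration.
# It prints the path of the first libnss_wrapper.so found in the given
# directories and returns 0, or returns 1 if none of them contains it.
find_nss_wrapper() {
    local dir
    for dir in "$@"; do
        if [ -f "$dir/libnss_wrapper.so" ]; then
            echo "$dir/libnss_wrapper.so"
            return 0
        fi
    done
    return 1
}

# The directories below are assumptions; adjust them for the base image in use.
if lib=$(find_nss_wrapper /usr/lib64 /usr/lib /usr/lib/x86_64-linux-gnu); then
    echo "nss_wrapper found at $lib"
else
    echo "libnss_wrapper.so not found; install nss_wrapper in the image first" >&2
fi
```

Failing early with a clear message is easier to debug than the login exception that Spark would otherwise raise later.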

If you are interested in prebuilt Spark images that can be used on OpenShift, I recommend starting here:

https://github.com/radanalyticsio/openshift-spark

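These images are produced as part of the tooling for the radanalytics.io community project, which has built a number of tools that make it easy to create Spark clusters on OpenShift. You can learn more about the project here: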

https://radanalytics.io/get-started