Through Knox

Time: 2016-10-17 14:32:01

Tags: apache-spark-sql kerberos knox-gateway

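I am trying to connect to the SparkSQL thriftserver (Spark 1.6.2) through Knox in a Kerberos-secured cluster (the Hadoop distribution is HDP 2.4.2). We have the same architecture in place for Hive, and it works fine. Since Spark uses the same thriftserver, I assumed that doing the same thing would be trivial, but it turns out it is not.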

When connecting through Knox, the Spark thriftserver throws this error:

16/10/17 15:25:39 ERROR ThriftHttpServlet: Failed to authenticate with hive/_HOST kerberos principal
16/10/17 15:25:39 ERROR ThriftHttpServlet: Error: 
org.apache.hive.service.auth.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:361)
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:136)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:755)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.spark-project.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at org.spark-project.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at org.spark-project.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle(Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)
at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:957)
at org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:358)
... 24 more
Caused by: org.apache.hive.service.auth.HttpAuthenticationException: Authorization header received from the client is empty.
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.getAuthHeader(ThriftHttpServlet.java:502)
at org.apache.hive.service.cli.thrift.ThriftHttpServlet.access$100(ThriftHttpServlet.java:68)
at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:403)
at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:366)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
... 25 more

Does anyone have any insight into this and how to solve it?

Thanks, Marco

1 answer:

Answer 0: (score: 1)

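As with HiveServer2, the empty client Authorization header may actually be a red herring. The first HTTP request carries no header; the header is normally sent only after the server has issued the SPNEGO challenge.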
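To illustrate why an empty header on the first request is expected, here is a minimal sketch of the SPNEGO challenge/response exchange. It is a toy simulation, not the ThriftHttpServlet logic: `handle_request`, `client_call`, and the placeholder token are hypothetical names, and no real Kerberos ticket is validated.

```python
def handle_request(headers):
    """Toy server: challenge any request that lacks a Negotiate header."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Negotiate "):
        # First round trip: no credentials yet, so ask the client for them.
        return 401, {"WWW-Authenticate": "Negotiate"}
    # A real server would validate the Kerberos token here; this sketch
    # accepts any non-empty placeholder.
    return 200, {}


def client_call(get_token):
    """Toy client: send a bare request, then retry after the challenge."""
    status, response_headers = handle_request({})
    attempts = 1
    challenge = response_headers.get("WWW-Authenticate", "")
    if status == 401 and challenge.startswith("Negotiate"):
        headers = {"Authorization": "Negotiate " + get_token()}
        status, _ = handle_request(headers)
        attempts += 1
    return status, attempts


if __name__ == "__main__":
    # "placeholder-token" stands in for a base64-encoded SPNEGO token.
    print(client_call(lambda: "placeholder-token"))
```

In this exchange the server logs an empty-header failure on the first request even when authentication ultimately succeeds, so the message alone may not indicate the real problem.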

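I was not actually aware that the SparkSQL thrift server could be used the same way as Hive. Do you know whether it supports trusted proxies, as many services in Hadoop do? That support lets a third-party component such as Apache Knox act on behalf of another user by asserting the authenticated username through the doAs query parameter. It also ensures that the doAs comes from an identity it trusts, in this case one established through Kerberos/SPNEGO authentication.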
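The trusted-proxy idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `with_do_as` and `effective_user`, the host `sts.example.com`, the port, and the `knox` principal are all hypothetical, and the check is far simpler than what a real Hadoop service performs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_do_as(url, end_user):
    """Proxy side: append doAs=<end_user> to the outgoing request URL."""
    parts = urlsplit(url)
    # Drop any doAs supplied by the caller so it cannot be spoofed.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "doAs"]
    query.append(("doAs", end_user))
    return urlunsplit(parts._replace(query=urlencode(query)))


def effective_user(authenticated_user, do_as, trusted_proxies):
    """Service side: honor doAs only when the caller is a trusted proxy."""
    if do_as is None:
        return authenticated_user
    if authenticated_user not in trusted_proxies:
        raise PermissionError(
            f"{authenticated_user} is not allowed to impersonate {do_as}")
    return do_as


if __name__ == "__main__":
    # Hypothetical backend URL and user names.
    print(with_do_as("http://sts.example.com:10001/cliservice", "marco"))
    print(effective_user("knox", "marco", {"knox"}))
```

The point of the second function is that the backend first authenticates the proxy itself (here via Kerberos/SPNEGO) and only then accepts the username the proxy asserts.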

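If it does not support trusted proxies, it will not work out of the box. Support would have to be added either to the SparkSQL thrift server itself or as a custom dispatch provider created in Knox for SparkSQL. A custom dispatch would let us propagate the user identity in whatever way SparkSQL expects.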

Hope this helps.

- Larry