Oozie Spark access to Hive with Kerberos

Time: 2018-09-26 18:31:09

Tags: hive oozie spark-java oozie-workflow mit-kerberos

When I run a Spark job from Oozie, it fails with the following error: the database is not found.

2018-09-26 15:27:23,576 INFO [main] org.apache.spark.deploy.yarn.Client: 
     client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
     diagnostics: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'zdm_datos_externos' not found;
     ApplicationMaster host: 10.74.234.6
     ApplicationMaster RPC port: 0
     queue: default
     queue user: administrador
     start time: 1537986410475
     final status: FAILED
     tracking URL: https://BR-PC-CentOS-02:26001/proxy/application_1537467570666_4127/
     user: administrador


This is my Spark configuration:

    String warehouseLocation = new File("spark-warehouse").getAbsolutePath();
    SparkSession spark = SparkSession
            .builder()
            .appName("Java Spark Hive Example")
            .master("yarn")
            .config("spark.sql.warehouse.dir", warehouseLocation)
            .config("spark.driver.maxResultSize", "3g")
            .config("spark.debug.maxToStringFields", "10000")
            .config("spark.sql.crossJoin.enabled", "true")
            .enableHiveSupport()
            .getOrCreate();
    spark.conf().set("spark.driver.maxResultSize", "3g");
  

    metastore, current connections: 1
    2018-09-26 17:31:42,598 WARN [main] hive.metastore: set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
    org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3748)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3734)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:557)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:249)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1533)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3157)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3176)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3409)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:178)
        at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:170)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply$mcV$sp(HiveCredentialProvider.scala:91)
        at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply(HiveCredentialProvider.scala:90)
        at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anonfun$obtainCredentials$1.apply(HiveCredentialProvider.scala:90)
        at org.apache.spark.deploy.yarn.security.HiveCredentialProvider$$anon$1.run(HiveCredentialProvider.scala:124)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1778)
        at org.apache.spark.deploy.yarn.security.HiveCredentialProvider.doAsRealUser(HiveCredentialProvider.scala:123)
        at org.apache.spark.deploy.yarn.security.HiveCredentialProvider.obtainCredentials(HiveCredentialProvider.scala:90)
        at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager$$anonfun$obtainCredentials$2.apply(ConfigurableCredentialManager.scala:82)
        at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager$$anonfun$obtainCredentials$2.apply(ConfigurableCredentialManager.scala:80)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.MapLike$DefaultValuesIterable.foreach(MapLike.scala:206)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
        at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
        at org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager.obtainCredentials(ConfigurableCredentialManager.scala:80)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:430)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:915)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:195)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1205)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1261)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:761)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:190)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:215)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:129)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
        at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:113)
        at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:104)
        at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
        at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:238)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:187)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)

1 answer:

Answer 0 (score: 0):

Which version of Spark are you using? Have you enabled Hive support on the SparkSession?

    sparkBuilder.enableHiveSupport().appName(appName).getOrCreate()
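
Since the question's code already calls enableHiveSupport(), the client-side Hive configuration may also be worth a look. The warning in the trace comes from the set_ugi() call inside HiveMetaStoreClient.open(). As far as I know, the Hive client attempts that call only when it is not using SASL, so a hive-site.xml that is missing from the job or lacks the Kerberos settings is one possible cause. The sketch below is a hypothetical stand-alone helper (HiveSiteCheck is not part of Spark, Hive, or Oozie). It uses only the JDK to print a few metastore properties from a hive-site.xml. The choice of keys is an assumption about what matters on a Kerberos cluster, not a confirmed diagnosis.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Spark, Hive, or Oozie.
final class HiveSiteCheck {

    // Keys assumed to be relevant for a Kerberos-secured metastore.
    static final String[] KEYS = {
        "hive.metastore.uris",
        "hive.metastore.sasl.enabled",
        "hive.metastore.kerberos.principal"
    };

    // Reads Hadoop-style configuration XML:
    // <configuration><property><name>..</name><value>..</value></property></configuration>
    static Map<String, String> readProperties(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            NodeList properties = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = textOf(property, "name");
                if (name != null) {
                    result.put(name, textOf(property, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or null.
    private static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java HiveSiteCheck <path-to-hive-site.xml>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Map<String, String> props = readProperties(in);
            for (String key : KEYS) {
                System.out.println(key + " = " + props.getOrDefault(key, "<not set>"));
            }
        }
    }
}
```

Run it against the hive-site.xml that is actually shipped with the Oozie Spark action, if there is one. If hive.metastore.uris or hive.metastore.sasl.enabled shows as not set there, that would be consistent with both the set_ugi() warning and the missing database, though it does not prove the cause.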