Question

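My Hive version is 1.1.0 and Spark is 1.6.0. There is no connection problem; I am able to connect successfully.

When importing data or creating a data link over a Hive connection, I can see the database names and the tables belonging to them after connecting, but I get an error (java.lang.IllegalArgumentException: java.net.UnknownHostException: xxx-nameservice) when retrieving data from a table. Below is my code: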

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.setConf("hive.metastore.uris", prop.getProperty("hive.metastore.uris"))
hiveContext.setConf("hive.metastore.sasl.enabled", prop.getProperty("hive.metastore.sasl.enabled"))
hiveContext.setConf("hive.security.authorization.enabled", prop.getProperty("hive.security.authorization.enabled"))
hiveContext.setConf("hive.metastore.kerberos.principal", prop.getProperty("hive.metastore.kerberos.principal"))
hiveContext.setConf("hive.metastore.execute.setugi", prop.getProperty("hive.metastore.execute.setugi"))
hiveContext.sql("use abc")
hiveContext.sql("show tables").show(4) // This is working
hiveContext.sql("select * from abc.tab1 limit 10").show(2) // This fails with the UnknownHostException

Below is the error:

java.lang.IllegalArgumentException: java.net.UnknownHostException: xxx-nameservice
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:728)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:671)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:155)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2800)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:97)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206)
    at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.listStatus(AvroContainerInputFormat.java:42)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
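
A note on reading the trace: the failure goes through NameNodeProxies.createNonHAProxy, which suggests the HDFS client is treating xxx-nameservice as a plain hostname rather than as an HA logical nameservice. That usually happens when the client-side Hadoop configuration does not carry the cluster's HA settings. The fragment below is only a sketch of what supplying those settings programmatically might look like, not a confirmed fix. It assumes xxx-nameservice really is an HA nameservice, reuses the `sc` from the code above, and uses made-up namenode IDs (nn1, nn2) and hostnames that must be replaced with the values from the cluster's hdfs-site.xml.

```scala
// Assumption: "xxx-nameservice" is an HDFS HA logical nameservice.
val ns = "xxx-nameservice"

// `sc` is the existing SparkContext from the question's code.
val hadoopConf = sc.hadoopConfiguration

// Declare the nameservice and its namenodes.
// "nn1"/"nn2" and the host:port values are hypothetical placeholders.
hadoopConf.set("dfs.nameservices", ns)
hadoopConf.set(s"dfs.ha.namenodes.$ns", "nn1,nn2")
hadoopConf.set(s"dfs.namenode.rpc-address.$ns.nn1", "namenode1.example.com:8020")
hadoopConf.set(s"dfs.namenode.rpc-address.$ns.nn2", "namenode2.example.com:8020")

// Tell the HDFS client how to pick the active namenode for this nameservice.
hadoopConf.set(
  s"dfs.client.failover.proxy.provider.$ns",
  "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")
```

These properties would most likely need to be set before the HiveContext is created so that the table scan picks them up. Putting the cluster's hdfs-site.xml and core-site.xml on the application classpath is an alternative that avoids hard-coding hosts.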

Kerberos-enabled remote Hive metastore (hive-1.1.0) access problem using Spark (1.6.0) SQL

0 Answers: