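Error when trying to fetch a table from a remote Hive2 server using Spark

Time: 2017-06-15 10:50:40

Tags: apache-spark hive apache-spark-sql

I am trying to access a table on a remote Hive2 server from Spark with the following code: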

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import com.typesafe.config._
import java.io._
import org.apache.hadoop.fs._
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession

object stack {
  def main(args: Array[String]): Unit = {
    val warehouseLocation = "/usr/hive/warehouse"

    // Metastore (MySQL) connection settings, set as JVM system properties
    System.setProperty("javax.jdo.option.ConnectionURL", "jdbc:mysql://sparkserver:3306/metastore?createDatabaseIfNotExist=true")
    System.setProperty("javax.jdo.option.ConnectionUserName", "hiveroot")
    System.setProperty("javax.jdo.option.ConnectionPassword", "hivepassword")
    System.setProperty("hive.exec.scratchdir", "/tmp/hive/${user.name}")
    System.setProperty("spark.sql.warehouse.dir", warehouseLocation)
    // System.setProperty("hive.metastore.uris", "thrift://sparkserver:9083")
    System.setProperty("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver")
    System.setProperty("hive.metastore.warehouse.dir", "/user/hive/warehouse")

    // The same settings, plus further Hive and HDFS options, passed to the session builder
    val spark = SparkSession.builder()
      .master("local")
      .appName("spark remote")
      .config("javax.jdo.option.ConnectionURL", "jdbc:mysql://sparkserver:3306/metastore?createDatabaseIfNotExist=true")
      .config("javax.jdo.option.ConnectionUserName", "hiveroot")
      .config("javax.jdo.option.ConnectionPassword", "hivepassword")
      .config("hive.exec.scratchdir", "/tmp/hive/${user.name}")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      // .config("hive.metastore.uris", "thrift://sparkserver:9083")
      .config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver")
      .config("hive.querylog.location", "/tmp/hivequerylogs/${user.name}")
      .config("hive.support.concurrency", "false")
      .config("hive.server2.enable.doAs", "true")
      .config("hive.server2.authentication", "PAM")
      .config("hive.server2.custom.authentication.class", "org.apache.hive.service.auth.PamAuthenticationProvider")
      .config("hive.server2.authentication.pam.services", "sshd,sudo")
      .config("hive.stats.dbclass", "jdbc:mysql")
      .config("hive.stats.jdbcdriver", "com.mysql.jdbc.Driver")
      .config("hive.session.history.enabled", "true")
      .config("hive.metastore.schema.verification", "false")
      .config("hive.optimize.sort.dynamic.partition", "false")
      .config("hive.optimize.insert.dest.volume", "false")
      .config("datanucleus.fixedDatastore", "true")
      .config("hive.metastore.warehouse.dir", "/user/hive/warehouse")
      .config("datanucleus.autoCreateSchema", "false")
      .config("datanucleus.schema.autoCreateAll", "true")
      .config("datanucleus.schema.validateConstraints", "true")
      .config("datanucleus.schema.validateColumns", "true")
      .config("datanucleus.schema.validateTables", "true")
      .config("fs.default.name", "hdfs://sparkserver:54310")
      .config("dfs.namenode.name.dir", "/usr/local/hadoop_tmp/hdfs/namenode")
      .config("dfs.datanode.name.dir", "/usr/local/hadoop_tmp/hdfs/datanode")
      .enableHiveSupport()
      .getOrCreate()

    import spark.implicits._
    import spark.sql

    // Read both tables and print every row on the driver
    sql("select * from sample.source").collect.foreach(println)
    sql("select * from sample.destination").collect.foreach(println)
  }
}

The remote Hive server refuses the connection request to the metastore.

Error: Failed to start hive-metastore.service: Unit hive-metastore.service not found

Thanks!

2 Answers:

Answer 0: (score: 1)

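When using .config("hive.metastore.uris", "hive2://hiveserver:9083"), hiveserver should be the correct IP address of the remote Hive server (and, as noted further down, the scheme of this URI should be thrift://, not hive2://).

The hive.metastore.uris setting points to the hive-metastore service. If you are running locally (on localhost) and want a remote metastore, you need to start the hive-metastore service separately: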

`$HIVE_HOME/bin/hive --service metastore -p 9083`
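Since the question reports a refused connection, it can help to confirm that something is actually listening on the metastore port before involving Spark at all. The following is a minimal sketch, assuming the host name hiveserver and port 9083 used in this answer; the object MetastorePortCheck and its isReachable helper are hypothetical names for illustration, not part of Hive or Spark.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object MetastorePortCheck {

  /** Returns true if a plain TCP connection to host:port succeeds within timeoutMs. */
  def isReachable(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      // connect throws on refusal, timeout, or an unresolvable host; Try turns that into false
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val host = "hiveserver" // assumption: replace with the real metastore host or IP
    val port = 9083         // port passed to the metastore service above
    if (isReachable(host, port)) println(s"$host:$port accepts TCP connections")
    else println(s"$host:$port is not reachable; is the metastore service running?")
  }
}
```

This only verifies that the TCP port is open. It does not speak the Thrift protocol, so a successful check does not guarantee that the listener is really the Hive metastore.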

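Alternatively, Hive uses a local metastore by default, so in that case you do not need to set any value for hive.metastore.uris.

Also, I forgot to mention: the property you are setting always uses the thrift protocol, regardless of whether it is hiveserver1 or hiveserver2.

So always use this: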

.config("hive.metastore.uris", "thrift://hiveserver:9083")
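Put together, the session setup from the question could be reduced to something like the sketch below. It assumes a metastore service is already running and reachable at hiveserver:9083, that the spark-sql and spark-hive modules are on the classpath, and that the table sample.source from the question exists. The object name RemoteMetastoreExample is made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object RemoteMetastoreExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local")
      .appName("spark remote")
      // Assumption: host and port of the running metastore service; the scheme is thrift://
      .config("hive.metastore.uris", "thrift://hiveserver:9083")
      .enableHiveSupport()
      .getOrCreate()

    // Table name taken from the question; show() prints the first rows on the driver
    spark.sql("select * from sample.source").show()

    spark.stop()
  }
}
```

With a remote metastore configured this way, the javax.jdo.option.* connection properties should not be needed on the Spark side, since only the metastore service itself talks to the backing database.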

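Answer 1: (score: 1)

Usually we do not need to point to the remote metastore separately.

hive-site.xml will internally point to the metastore via JDBC.

The same configuration can be set in the program before initializing the HiveContext. Give it a try: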

System.setProperty("javax.jdo.option.ConnectionURL", "jdbc:mysql://<ip>/metastore?createDatabaseIfNotExist=true")
...("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver")
...("javax.jdo.option.ConnectionUserName", "mysql-user")
...("javax.jdo.option.ConnectionPassword", "mysql-passwd")