Connecting to Hive via JDBC from Scala - HDP

Date: 2016-05-09 19:10:54

Tags: scala jdbc hive

I am trying to connect to Hive (on the Hortonworks sandbox) and I get the following error:

Exception in thread "main" java.sql.SQLException: No suitable driver found for jdbc:hive2://sandbox.hortonworks.com:10000/default

Maven dependencies:

<dependencies>
    <dependency>                                                      
        <groupId>org.apache.spark</groupId>                       
        <artifactId>spark-core_2.10</artifactId>                  
        <version>${spark.version}</version>                       
        <scope>provided</scope>                                   
    </dependency>                                                     
    <dependency>                                                      
        <groupId>org.apache.spark</groupId>                       
        <artifactId>spark-sql_2.10</artifactId>                   
        <version>${spark.version}</version>                       
        <scope>provided</scope>                                   
    </dependency>                                                     
    <dependency>                                                      
        <groupId>org.apache.spark</groupId>                       
        <artifactId>spark-hive_2.10</artifactId>                  
        <version>${spark.version}</version>                       
        <scope>provided</scope>                                   
    </dependency>                                                     
</dependencies>  

Code:

    import org.apache.hadoop.fs.FileSystem
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.deploy.SparkHadoopUtil
    import org.apache.spark.sql.hive.HiveContext

    // **** setMaster("local") is only for testing ****
    // Set context
    val sparkConf = new SparkConf().setAppName("process").setMaster("local")
    val sc = new SparkContext(sparkConf)
    val hiveContext = new HiveContext(sc)

    // Set HDFS
    System.setProperty("HADOOP_USER_NAME", "hdfs")
    val hdfsconf = SparkHadoopUtil.get.newConfiguration(sc.getConf)
    hdfsconf.set("fs.defaultFS", "hdfs://sandbox.hortonworks.com:8020")
    val hdfs = FileSystem.get(hdfsconf)

    // Set Hive connector
    val url = "jdbc:hive2://sandbox.hortonworks.com:10000/default"
    val user = "username"
    val password = "password"

    hiveContext.read.format("jdbc").options(Map(
      "url" -> url,
      "user" -> user,
      "password" -> password,
      "dbtable" -> "tablename")).load()

1 Answer:

Answer 0 (score: 1)

You need the Hive JDBC driver on your application classpath:

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.2.1</version>
    <scope>provided</scope>
</dependency>

Also, specify the driver explicitly in the options:

"driver" -> "org.apache.hive.jdbc.HiveDriver"

However, it is better to skip JDBC entirely and use Spark's native Hive integration, since it works directly against the Hive metastore. See http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
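
For comparison, a minimal sketch of that native integration (assuming the sandbox's hive-site.xml is on the application classpath so the HiveContext can find the metastore; the table name is the same placeholder used above):

    // Goes through the Hive metastore directly; no JDBC driver or URL required
    val df = hiveContext.sql("SELECT * FROM default.tablename")
    df.show()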