I am trying to learn a Scala-Spark JDBC program in IntelliJ IDEA. For that, I created a Scala SBT project with the structure shown below.
Before writing the JDBC connection parameters into the class, I first tried to load a properties file that holds all my connection properties and print them to check that they load correctly, as follows:
testconnection.properties:
devUserName=username
devPassword=password
gpDriverClass=org.postgresql.Driver
gpDevUrl=jdbc:url
Code:
package com.yearpartition.obj

import java.io.FileInputStream
import java.util.Properties

import org.apache.spark.sql.SparkSession
import org.apache.spark.SparkConf

object PartitionRetrieval {

  var conf = new SparkConf().setAppName("Spark-JDBC")

  // Load the JDBC connection properties from an external file.
  val conFile    = "/home/hmusr/ReconTest/inputdir/testconnection.properties"
  val properties = new Properties()
  properties.load(new FileInputStream(conFile))

  val connectionUrl = properties.getProperty("gpDevUrl")
  val devUserName   = properties.getProperty("devUserName")
  val devPassword   = properties.getProperty("devPassword")
  val gpDriverClass = properties.getProperty("gpDriverClass")

  println("connectionUrl: " + connectionUrl)

  // Register the JDBC driver; this line throws ClassNotFoundException
  // when the driver jar is not on the classpath.
  Class.forName(gpDriverClass).newInstance()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .enableHiveSupport()
      .config(conf)
      .master("local[2]")
      .getOrCreate()
    println("connectionUrl: " + connectionUrl)
  }
}
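For context, the properties loaded above would eventually feed a JDBC read. A minimal sketch of that step, for illustration only: it would sit inside main where spark is in scope, and the table name yearly_data is a hypothetical placeholder, not something from my actual setup.

// Minimal sketch: read one table over JDBC using the loaded properties.
// "yearly_data" is a hypothetical placeholder table name.
val jdbcProps = new java.util.Properties()
jdbcProps.setProperty("user", devUserName)
jdbcProps.setProperty("password", devPassword)
jdbcProps.setProperty("driver", gpDriverClass)

val df = spark.read.jdbc(connectionUrl, "yearly_data", jdbcProps)
df.show(5)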
Contents of build.sbt:
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.0.0" % "provided",
  "org.json4s" %% "json4s-jackson" % "3.2.11" % "provided",
  "org.postgresql" % "postgresql" % "42.1.1" % "provided",
  "org.apache.httpcomponents" % "httpclient" % "4.5.3"
)
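Note that the postgresql dependency above is marked "provided", so sbt compiles against it but does not package it into yearpartition_2.11-0.1.jar; the driver then has to reach the JVM some other way, such as --jars. A sketch of the alternative, assuming the driver should instead ship inside a fat jar built with sbt-assembly, would simply drop the scope:

// build.sbt: without "provided", sbt-assembly would bundle the driver
// into the fat jar (assumes the sbt-assembly plugin is configured).
"org.postgresql" % "postgresql" % "42.1.1"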
When I execute spark-submit, I get the following exception:
Caused by: java.lang.ClassNotFoundException: org.postgresql.Driver
spark-submit command:
SPARK_MAJOR_VERSION=2 spark-submit --class com.yearpartition.obj.PartitionRetrieval yearpartition_2.11-0.1.jar --driver-class-path /home/hmusr/jars/postgresql-42.1.4.jar --jars /home/hmusr/jars/postgresql-42.1.4.jar
I have the postgres jar both in the directory /home/hmusr/jars/ and in the sbt dependencies. Could anyone tell me what is causing this issue and how to fix it?
Answer 0 (score: 0):
You need to have the driver class on the classpath. Download the jar from here and pass it to spark-submit using --jars. Note that --jars does not work in local mode.
In an sbt project, add it to your lib directory when building.
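Separately, it is worth noting that in the command shown in the question, --driver-class-path and --jars appear after the application jar. spark-submit treats everything after the application jar as arguments to the main class, so those two options are silently ignored. A corrected invocation, reusing the jar paths from the question, would place them before the application jar:

SPARK_MAJOR_VERSION=2 spark-submit \
  --class com.yearpartition.obj.PartitionRetrieval \
  --driver-class-path /home/hmusr/jars/postgresql-42.1.4.jar \
  --jars /home/hmusr/jars/postgresql-42.1.4.jar \
  yearpartition_2.11-0.1.jar

Since the program runs with master("local[2]"), the driver and executors share a single JVM, so --driver-class-path is the option that actually puts the postgres driver on that JVM's classpath.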