Solved: the line prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") has to be added to the connection properties.
I am trying to launch a Spark job locally. I created a jar with dependencies via Maven.
Here is my pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.agildata</groupId>
    <artifactId>spark-rdd-dataframe-dataset</artifactId>
    <packaging>jar</packaging>
    <version>1.0</version>

    <properties>
        <exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
        <spark.version>1.6.0</spark.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>com.oracle</groupId>
            <artifactId>ojdbc7</artifactId>
            <version>12.1.0.2</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>add-source</goal>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <phase>process-test-resources</phase>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <!-- get all project dependencies -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <!-- MainClass in manifest makes an executable jar -->
                    <archive>
                        <manifest>
                            <mainClass>example.dataframe.ScalaDataFrameExample</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
I ran the mvn package command and the build was successful. After that I tried to run the job this way:
GMAC:bin gabor_dev$ sh spark-submit --class example.dataframe.ScalaDataFrameExample --master spark://QGMAC.local:7077 /Users/gabor_dev/IdeaProjects/dataframe/target/spark-rdd-dataframe-dataset-1.0-jar-with-dependencies.jar
But it throws this: Exception in thread "main" java.sql.SQLException: No suitable driver
The full error message:
16/07/08 13:09:22 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.sql.SQLException: No suitable driver
at java.sql.DriverManager.getDriver(DriverManager.java:315)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:222)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
at example.dataframe.ScalaDataFrameExample$.main(ScalaDataFrameExample.scala:30)
at example.dataframe.ScalaDataFrameExample.main(ScalaDataFrameExample.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/07/08 13:09:22 INFO SparkContext: Invoking stop() from shutdown hook
Interestingly, if I build and run it this way from the IntelliJ IDEA embedded console: mvn package exec:java -Dexec.mainClass=example.dataframe.ScalaDataFrameExample
it runs without any errors.
Here is the relevant part of the Scala code:
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val url = "jdbc:oracle:thin:@xxx.xxx.xx:1526:SIDNAME"
val prop = new java.util.Properties
prop.setProperty("user", "usertst")
prop.setProperty("password", "usertst")
val people = sqlContext.read.jdbc(url, "table_name", prop)
people.show()
I checked my jar file and it contains all the dependencies. Can anyone help me solve this? Thanks!
Answer 0 (score: 10)
The missing driver is the JDBC driver, and you have to make it known to SparkSQL. The driver class is inside your fat jar, but under spark-submit that jar is loaded by a separate classloader, so java.sql.DriverManager cannot discover it on its own; this is also why the mvn exec:java run, where the driver sits on the application classpath, works fine. You can register the driver either when you submit the application, as described by this answer, or through your Properties object, as with this line:
prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver")
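For reference, the submission-time alternative is to put the driver jar on the classpath when launching the job. A minimal sketch, assuming the ojdbc7 jar sits at an illustrative path that you would replace with wherever your local Maven repository or lib directory keeps it:

sh spark-submit --class example.dataframe.ScalaDataFrameExample \
  --master spark://QGMAC.local:7077 \
  --driver-class-path /path/to/ojdbc7-12.1.0.2.jar \
  --jars /path/to/ojdbc7-12.1.0.2.jar \
  /Users/gabor_dev/IdeaProjects/dataframe/target/spark-rdd-dataframe-dataset-1.0-jar-with-dependencies.jar

Both --driver-class-path and --jars are standard spark-submit options: the former makes the driver class visible to DriverManager on the driver side, the latter ships the jar to the executors.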
Answer 1 (score: 0)
This is how you connect to PostgreSQL with Spark. Note that the driver is again named explicitly, here via the driver option:
import java.util.Properties;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession sparkSession = SparkSession.builder()
        .appName("dky")
        .master("local[*]")
        .getOrCreate();

Logger.getLogger("org.apache").setLevel(Level.WARN);

Properties properties = new Properties();
properties.put("user", "your user name");
properties.put("password", "your password");

Dataset<Row> jdbcDF = sparkSession.read().option("driver", "org.postgresql.Driver")
        .jdbc("jdbc:postgresql://localhost:5432/postgres", "your table name along with schema name", properties);

jdbcDF.show();
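For this to work, the PostgreSQL JDBC driver has to be on the classpath as well. A minimal sketch of the Maven dependency, with an illustrative version you would pin to your own:

<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.2.5</version>
</dependency>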