I am trying to write a simple program that writes a Spark DataFrame
to a Redis cache.
Here is my code:
Dataset<Row> txnDf = sparkSession.read().format("jdbc").option("url", connection)
    .option("dbtable", "ao_temp").load();
System.out.println("Record Count:" + txnDf.count());
System.out.println(new Date());
System.out.println("Writing To Redis");
System.out.println(new Date());
txnDf.write()
    .format("org.apache.spark.sql.redis")
    .option("table", "ao_temp")
    .option("key.column", "txn_detail_id")
    .mode(SaveMode.Overwrite)
    .save();
System.out.println(new Date());
So here I use Spark to load a table from the database and write it to the Redis cache.
I get the following exception:
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.redis. Please find packages at http://spark.apache.org/third-party-projects.html
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:635)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:241)
at hellospark.HelloRedis.main(HelloRedis.java:137)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.redis.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23$$anonfun$apply$15.apply(DataSource.scala:618)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23$$anonfun$apply$15.apply(DataSource.scala:618)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23.apply(DataSource.scala:618)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$23.apply(DataSource.scala:618)
at scala.util.Try.orElse(Try.scala:84)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:618)
... 2 more
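The "Caused by" line shows Spark trying to load the class org.apache.spark.sql.redis.DefaultSource and not finding it. To see whether that class is visible to the driver JVM at all, something like the following minimal sketch could be run with the same classpath as the application. The class name DataSourceClasspathCheck and the method isOnClasspath are hypothetical names made up for this illustration; they are not part of Spark or spark-redis, and the check only covers the JVM it runs in, not the executors.

```java
// Hypothetical diagnostic helper (not part of Spark or spark-redis).
class DataSourceClasspathCheck {

    // Returns true if the named class can be located and loaded by the
    // class loader that loaded this helper; false if it is missing or
    // cannot be linked (for example, because one of its own dependencies is absent).
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only load the class, do not run static initializers.
            Class.forName(className, false, DataSourceClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the "Caused by" line of the stack trace.
        String provider = "org.apache.spark.sql.redis.DefaultSource";
        System.out.println(provider + " on classpath: " + isOnClasspath(provider));
    }
}
```

If this reports false, that would suggest the spark-redis jar is not on the driver classpath (for example, missing from the build or not passed via --jars), which would be a separate problem from any version mismatch.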
Any comments or pointers would be helpful.
Thanks.
Edit
The issue pointed out here is related to a version mismatch, but can someone tell me which versions of spark-redis and jedis should be used with Spark 2.3.0?