I get the following error when trying to insert some data into a Cassandra table from Spark.
com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: Could not initialize class com.datastax.driver.core.Cluster
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at com.github.adejanovski.cassandra.jdbc.CassandraDriver.connect(CassandraDriver.java:102)
at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:59)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:50)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:538)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.datastax.driver.core.Cluster
at com.github.adejanovski.cassandra.jdbc.SessionHolder.createSession(SessionHolder.java:137)
at com.github.adejanovski.cassandra.jdbc.SessionHolder.<init>(SessionHolder.java:83)
at com.github.adejanovski.cassandra.jdbc.CassandraDriver$1.load(CassandraDriver.java:68)
at com.github.adejanovski.cassandra.jdbc.CassandraDriver$1.load(CassandraDriver.java:65)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
... 20 more
The code runs successfully the first time, but when I run the same code again I get the error above.
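For context, a DataFrame write along the following lines would exercise the code path shown in the trace (this is a hypothetical reconstruction; the input source, host, keyspace, and table names are placeholders, not taken from the question):

// Minimal Scala sketch of a Spark-to-Cassandra write over JDBC.
// Assumes an existing SparkSession named `spark`; everything except the
// driver class name (which appears in the stack trace) is a placeholder.
val df = spark.read.parquet("/path/to/input")  // any DataFrame to be written

df.write
  .format("jdbc")
  .option("driver", "com.github.adejanovski.cassandra.jdbc.CassandraDriver")
  .option("url", "jdbc:cassandra://cassandra-host:9042/my_keyspace")  // placeholder host/keyspace
  .option("dbtable", "my_table")                                      // placeholder table
  .mode("append")
  .save()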
Answer 0 (score: 0)
By default, a packaged application contains only your own classes, not the libraries it was compiled against. As a result, the Spark driver and executors don't have the libraries your code uses, such as the Cassandra connector.
If you intend to use spark-submit, you need to build an uber jar that bundles all of the dependency libraries. Maven or SBT can help you with this.
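With SBT, for example, the sbt-assembly plugin builds such a jar (Maven users would use the maven-shade-plugin). The snippet below is a minimal sketch; the plugin version, library versions, and the exact coordinates of the Cassandra JDBC wrapper are assumptions to adapt to your project:

// project/plugins.sbt -- enable the sbt-assembly plugin (version is illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt -- minimal assembly setup
name := "spark-cassandra-job"
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark is already on the cluster, so mark it "provided"
  // to keep it out of the uber jar
  "org.apache.spark" %% "spark-sql" % "2.1.0" % "provided",
  // the Cassandra JDBC wrapper seen in the stack trace
  // (coordinates and version are assumed; check the ones you actually use)
  "com.github.adejanovski" % "cassandra-jdbc-wrapper" % "3.1.0"
)

Running sbt assembly then produces a single jar under target/scala-2.11/ that you pass to spark-submit, so every executor has the Cassandra driver classes on its classpath.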