I am trying to connect to a remote Cassandra cluster from my Spark shell using the Spark-Cassandra connector, but it throws an exception.
I followed the usual steps described on the spark-cassandra-connector GitHub page:
1) Start the shell
$SPARK_HOME/bin/spark-shell --packages datastax:spark-cassandra-connector:2.0.0-s_2.11
2) Import the Cassandra connector
import com.datastax.spark.connector._
import org.apache.spark.sql.cassandra._
Everything works fine up to this point, but when I try to create an RDD:
val rdd=sc.cassandraTable("test","user")
it throws this exception:
java.lang.NoClassDefFoundError: org/apache/commons/configuration/ConfigurationException
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at com.datastax.spark.connector.cql.CassandraConnectorConf$.apply(CassandraConnectorConf.scala:257)
at com.datastax.spark.connector.cql.CassandraConnector$.apply(CassandraConnector.scala:189)
at com.datastax.spark.connector.SparkContextFunctions.cassandraTable$default$3(SparkContextFunctions.scala:52)
... 53 elided
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.ConfigurationException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Spark version 2.3.1, connector version = 2.0.0
Answer (score: 3)
For Spark 2.3.1 you need to use Spark Connector version >= 2.3.0 (the current version is 2.3.2):
spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.11:2.3.2 \
--conf spark.cassandra.connection.host=<IP>
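With the matching connector on the classpath, the original read should succeed. A minimal session sketch, assuming the keyspace `test` and table `user` from the question exist on the remote cluster (it requires a live Cassandra node, so it is illustrative rather than runnable standalone):

```scala
// Inside the spark-shell started with the matching connector package,
// e.g. com.datastax.spark:spark-cassandra-connector_2.11:2.3.2
import com.datastax.spark.connector._

// Same call that previously threw NoClassDefFoundError; with the
// version-matched connector it returns a CassandraTableScanRDD[CassandraRow].
val rdd = sc.cassandraTable("test", "user")

// Forcing an action actually contacts the cluster, which confirms both the
// classpath fix and the spark.cassandra.connection.host setting are correct.
rdd.take(5).foreach(println)
```

Note that `spark.cassandra.connection.host` must be set at shell launch time (as in the `--conf` flag above), since the RDD API reads it from the `SparkConf` the context was created with.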