Why does spark-submit fail with "Failed to load class for data source: org.apache.spark.sql.cassandra" when the Cassandra connector is passed via --jars?

Asked: 2016-01-04 10:31:57

Tags: apache-spark apache-kafka cassandra-2.0 spark-cassandra-connector

Spark version: 1.4.1

Cassandra version: 2.1.8

DataStax Cassandra Connector: 1.4.2-SNAPSHOT.jar

The command I ran:

  ./spark-submit --jars /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar --driver-class-path /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar --jars /usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar --jars /usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar --driver-class-path /usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar --driver-class-path /usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar --packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1 --executor-memory 6g --executor-cores 6 --master local[4] kafka_streaming.py

Here is the error I got:

Py4JJavaError: An error occurred while calling o169.save.
: java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra

I must be doing something silly. Any replies would be greatly appreciated.

1 answer:

Answer 0 (score: 3)

Try supplying all the jars with a single --jars option, comma-separated:

--jars yourFirstJar.jar,yourSecondJar.jar
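Applied to the asker's command, that means building one comma-separated list instead of repeating the flag (repeated --jars options do not accumulate, so only one of them takes effect). A minimal sketch, with paths taken from the question; spark-submit is not actually invoked here, the assembled command is only echoed for inspection:

```shell
#!/bin/sh
# Collect every dependency jar into ONE comma-separated list.
JARS=/usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar
JARS=$JARS,/usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar
JARS=$JARS,/usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar

# Echo the final command rather than running it, so the single
# --jars flag and its comma-separated value can be verified.
echo ./spark-submit --jars "$JARS" \
     --executor-memory 6g --executor-cores 6 \
     --master 'local[4]' kafka_streaming.py
```

Quoting `local[4]` keeps the shell from treating the brackets as a glob pattern.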

A more convenient solution for development is to pull the jars from Maven Central with --packages (also comma-separated):

--packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1,com.datastax.spark:spark-cassandra-connector_2.10:1.4.1