spark-cassandra java.lang.NoClassDefFoundError:com / datastax / spark / connector / japi / CassandraJavaUtil

时间:2016-04-26 17:09:07

标签: apache-spark cassandra

16/04/26 16:58:46 DEBUG ProtobufRpcEngine: Call: complete took 3ms
Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/spark/connector/japi/CassandraJavaUtil
        at com.baitic.mcava.lecturahdfssaveincassandra.TratamientoCSV.main(TratamientoCSV.java:123)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.japi.CassandraJavaUtil
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 10 more
16/04/26 16:58:46 INFO SparkContext: Invoking stop() from shutdown hook
16/04/26 16:58:46 INFO SparkUI: Stopped Spark web UI at http://10.128.0.5:4040
16/04/26 16:58:46 INFO SparkDeploySchedulerBackend: Shutting down all executors
16/04/26 16:58:46 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
16/04/26 16:58:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/04/26 16:58:46 INFO MemoryStore: MemoryStore cleared
16/04/26 16:58:46 INFO BlockManager: BlockManager stopped
16/04/26 16:58:46 INFO BlockManagerMaster: BlockManagerMaster stopped
16/04/26 16:58:46 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/04/26 16:58:46 INFO SparkContext: Successfully stopped SparkContext
16/04/26 16:58:46 INFO ShutdownHookManager: Shutdown hook called
16/04/26 16:58:46 INFO ShutdownHookManager: Deleting directory /srv/spark/tmp/spark-2bf57fa2-a2d5-4f8a-980c-994e56b61c44
16/04/26 16:58:46 DEBUG Client: stopping client from cache: org.apache.hadoop.ipc.Client@3fb9a67f
16/04/26 16:58:46 DEBUG Client: removing client from cache: org.apache.hadoop.ipc.Client@3fb9a67f
16/04/26 16:58:46 DEBUG Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@3fb9a67f
16/04/26 16:58:46 DEBUG Client: Stopping client
16/04/26 16:58:46 DEBUG Client: IPC Client (2107841088) connection to mcava-master/10.128.0.5:54310 from baiticpruebas2: closed
16/04/26 16:58:46 DEBUG Client: IPC Client (2107841088) connection to mcava-master/10.128.0.5:54310 from baiticpruebas2: stopped, remaining connections 0
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

我制作这个简单的代码:

/  String pathDatosTratados="hdfs://mcava-master:54310/srv/hadoop/data/spark/DatosApp/medidasSensorTratadas.txt";
    String jarPath ="hdfs://mcava-master:54310/srv/hadoop/data/spark/original-LecturaHDFSsaveInCassandra-1.0-SNAPSHOT.jar";
     String jar="hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-assembly-1.6.0-M1-4-g6f01cfe.jar";
    String jar2="hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar";
     String[] jars= new String[3];
     jars[0]=jarPath;
   jars[2]=jar;
     jars[1]=jar2;

    SparkConf conf=new SparkConf().setAppName("TratamientoCSV").setJars(jars);
    conf.set("spark.cassandra.connection.host", "10.128.0.5");
     conf.set("spark.kryoserializer.buffer.max","512");
     conf.set("spark.kryoserializer.buffer","256");
 //     conf.setJars(jars);
    JavaSparkContext sc= new  JavaSparkContext(conf);

     JavaRDD<String> input= sc.textFile(pathDatos);

我还在spark-default.conf

中放置了cassandra驱动器的路径
spark.driver.extraClassPath      hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar
spark.executor.extraClassPath    hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar

我也把标志--jars放到了驱动程序的路径上,但我总是犯同样的错误,我不明白为什么?

我在谷歌引擎工作

4 个答案:

答案 0 :(得分:7)

尝试在提交应用时添加包。

$SPARK_HOME/bin/spark-submit --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.11 ....

答案 1 :(得分:2)

我添加此参数来解决此问题: - package datastax:spark-cassandra-connector:1.6.0-M2-s_2.10。

答案 2 :(得分:0)

我解决了这个问题......我制作了一个带有所有依赖关系的胖jar,并且没有必要仅仅指出对cassandra连接器的引用是对胖罐的引用。

答案 3 :(得分:0)

我在Java程序中使用了Spark,并遇到了同样的问题。 问题是,因为我没有将 spark-cassandra-connector 包含到我项目的maven依赖项中。

    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.11</artifactId>
        <version>2.0.7</version> <!-- Check actual version in maven repo -->
    </dependency>

之后我用所有依赖项填充了 fat jar - 它确实有效! 也许它会帮助某人