Cannot write to Elasticsearch from Spark in Java (throws java.lang.IncompatibleClassChangeError: Implementing class)

Time: 2015-01-16 09:51:17

Tags: java elasticsearch apache-spark

I am using a simple Java program to index a Spark JavaRDD into Elasticsearch. My code looks like this:

    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    import static org.elasticsearch.spark.rdd.api.java.JavaEsSpark.saveToEs;

    SparkConf conf = new SparkConf().setAppName("IndexDemo").setMaster("spark://ct-0094:7077");
    conf.set("spark.serializer", org.apache.spark.serializer.KryoSerializer.class.getName());
    conf.set("es.index.auto.create", "true");
    conf.set("es.nodes", "192.168.50.103");
    conf.set("es.port", "9200");
    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.addJar("./target/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar");

    String arrayval = "string";
    List<Data> data = Arrays.asList(
            new Data(1L, 10L, arrayval + "1"),
            new Data(2L, 20L, arrayval + "2"),
            new Data(3L, 30L, arrayval + "3"),
            new Data(4L, 40L, arrayval + "4"),
            new Data(5L, 50L, arrayval + "5"),
            new Data(6L, 60L, arrayval + "6"),
            new Data(7L, 70L, arrayval + "7"),
            new Data(8L, 80L, arrayval + "8"),
            new Data(9L, 90L, arrayval + "9"),
            new Data(10L, 100L, arrayval + "10")
    );

    JavaRDD<Data> javaRDD = sc.parallelize(data);
    saveToEs(javaRDD, "index/type");

Running the above code produces this exception (stack trace):


    15/01/16 13:20:41 INFO spark.SecurityManager: Changing view acls to: root
    15/01/16 13:20:41 INFO spark.SecurityManager: Changing modify acls to: root
    15/01/16 13:20:41 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
    15/01/16 13:20:41 INFO slf4j.Slf4jLogger: Slf4jLogger started
    15/01/16 13:20:41 INFO Remoting: Starting remoting
    15/01/16 13:20:41 INFO Remoting: Remoting started; listening on addresses: [akka.tcp://sparkDriver@ct-0015:55586]
    15/01/16 13:20:41 INFO util.Utils: Successfully started service 'sparkDriver' on port 55586.
    15/01/16 13:20:41 INFO spark.SparkEnv: Registering MapOutputTracker
    15/01/16 13:20:41 INFO spark.SparkEnv: Registering BlockManagerMaster
    15/01/16 13:20:41 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150116132041-f924
    15/01/16 13:20:41 INFO storage.MemoryStore: MemoryStore started with capacity 2.3 GB
    15/01/16 13:20:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    15/01/16 13:20:41 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-a65b108f-e131-480a-85b2-ed65650cf991
    15/01/16 13:20:42 INFO spark.HttpServer: Starting HTTP Server
    15/01/16 13:20:42 INFO server.Server: jetty-8.1.14.v20131031
    15/01/16 13:20:42 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:34049
    15/01/16 13:20:42 INFO util.Utils: Successfully started service 'HTTP file server' on port 34049.
    15/01/16 13:20:42 INFO server.Server: jetty-8.1.14.v20131031
    15/01/16 13:20:42 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
    15/01/16 13:20:42 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
    15/01/16 13:20:42 INFO ui.SparkUI: Started SparkUI at http://ct-0015:4040
    15/01/16 13:20:42 INFO client.AppClient$ClientActor: Connecting to master spark://ct-0094:7077...
    15/01/16 13:20:42 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20150116131933-0078
    15/01/16 13:20:42 INFO netty.NettyBlockTransferService: Server created on 34762
    15/01/16 13:20:42 INFO storage.BlockManagerMaster: Trying to register BlockManager
    15/01/16 13:20:42 INFO storage.BlockManagerMasterActor: Registering block manager ct-0015:34762 with 2.3 GB RAM, BlockManagerId(, ct-0015, 34762)
    15/01/16 13:20:42 INFO storage.BlockManagerMaster: Registered BlockManager
    15/01/16 13:20:42 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
    15/01/16 13:20:43 INFO spark.SparkContext: Added JAR ./target/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar at http://192.168.50.103:34049/jars/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar with timestamp 1421394643161
    Exception in thread "main" java.lang.IncompatibleClassChangeError: Implementing class
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:455)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:367)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:30)
        at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:24)
        at org.elasticsearch.spark.rdd.api.java.JavaEsSpark$.saveToEs(JavaEsSpark.scala:28)
        at org.elasticsearch.spark.rdd.api.java.JavaEsSpark.saveToEs(JavaEsSpark.scala)
        at com.cleartrail.spark.poc.elasticsearch.demo.ESPerformerClass.main(ESPerformerClass.java:39)

I have the following dependencies in my pom.xml:

 <dependencies>
  <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.2.0</version>
  </dependency>
  <dependency>
      <groupId>org.spark-project</groupId>
      <artifactId>spark-streaming_2.9.2</artifactId>
      <version>0.7.0</version>
  </dependency>
  <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.2.0</version>
  </dependency>
  <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch-hadoop</artifactId>
      <version>2.0.2</version>
  </dependency>
 </dependencies>

I am using Elasticsearch 0.90.3 and Apache Spark 1.2.0.

Is there a version mismatch somewhere? Or has the saveToEs method been deprecated?

1 Answer:

Answer 0 (score: 0)

It was indeed a version mismatch, but not between Elasticsearch and elasticsearch-hadoop; the mismatch was between elasticsearch-hadoop and Spark. The elasticsearch-hadoop integration only works with Spark 1.1.0 (not yet with 1.2.0). So I changed the Spark version, and my problem went away.
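As a sketch, the fix amounts to downgrading the Spark artifacts in pom.xml to a 1.1.x release and dropping the stray legacy `spark-streaming_2.9.2` dependency, which pulls in incompatible old classes. The exact 1.1.x patch version to pick is an assumption here; check which Spark release your elasticsearch-hadoop version was built against:

```xml
<dependencies>
  <dependency> <!-- Spark dependency, downgraded to the 1.1.x line -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.1.0</version>
  </dependency>
  <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.1.0</version>
  </dependency>
  <!-- the old org.spark-project:spark-streaming_2.9.2:0.7.0 dependency is removed -->
  <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch-hadoop</artifactId>
      <version>2.0.2</version>
  </dependency>
</dependencies>
```

After changing the versions, rebuild the fat JAR (`mvn clean package`) so the jar shipped via `sc.addJar(...)` no longer contains the mismatched classes.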