Java - Unable to connect Elasticsearch with Spark

Date: 2019-09-14 18:32:25

Tags: java apache-spark hadoop elasticsearch

I get this error when I try to connect Elasticsearch with Spark:

19/09/14 20:17:32 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
19/09/14 20:17:32 INFO HttpMethodDirector: Retrying request
19/09/14 20:17:53 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
19/09/14 20:17:53 INFO HttpMethodDirector: Retrying request
19/09/14 20:18:14 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
19/09/14 20:18:14 INFO HttpMethodDirector: Retrying request
19/09/14 20:18:35 ERROR NetworkClient: Node [xx.xx.xx.10:7474] failed (Connection timed out: connect); selected next node [xx.xx.xx.11:7474]
19/09/14 20:18:56 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
19/09/14 20:18:56 INFO HttpMethodDirector: Retrying request
19/09/14 20:21:23 ERROR NetworkClient: Node [xx.xx.xx.12:7474] failed (Connection timed out: connect); no other nodes left - aborting...
19/09/14 20:21:23 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[xx.xx.xx.10:7474, xx.xx.xx.11:7474, xx.xx.xx.12:7474]] 
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:149)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:461)
    at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:469)
    at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:565)
    at org.elasticsearch.hadoop.rest.RestClient.indexExists(RestClient.java:560)
    at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:571)
    at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:418)
    at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:609)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:597)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:107)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:107)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
19/09/14 20:21:23 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, localhost, executor driver): org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[xx.xx.xx.10:7474, xx.xx.xx.11:7474, xx.xx.xx.12:7474]] 
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:149)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:461)
    at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:469)
    at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:565)
    at org.elasticsearch.hadoop.rest.RestClient.indexExists(RestClient.java:560)
    at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:571)
    at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:418)
    at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:609)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:597)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:107)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:107)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

The code is as follows:

        SparkConf conf = new SparkConf()
                .setAppName("ES-Spark")
                .setMaster("local[*]")
                .set("es.nodes", "cluster-xxxx.xx.xx.xx")
                .set("es.port", "50467")
                .set("es.resource", "recommendations_2/recommendation")
                .set("es.resource.read", "products-20190912/product")
                .set("es.resource.write", "recommendations_2/recommendation");
        JavaSparkContext ctx = new JavaSparkContext(conf);
        JavaStreamingContext jsc = new JavaStreamingContext(ctx, new Duration(10000));
        // Read from Kafka and index each record's value into Elasticsearch.
        JavaInputDStream<ConsumerRecord<String, String>> kafkaStream = Utils.getKafkaStream(jsc);
        JavaDStream<String> allInfo = kafkaStream.map(f -> f.value());
        JavaEsSparkStreaming.saveToEs(allInfo, "recommendations_2/recommendation");
        jsc.start();
        jsc.awaitTermination();

It seems to reach the cluster, but then fails to connect to the individual nodes...
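From the log, the connector does reach the configured endpoint and discovers the data nodes, but then times out on each discovered address (xx.xx.xx.10-12 on port 7474) rather than on the configured port 50467. From what I have read, this can happen when node discovery returns internal addresses that are not routable from the client machine; es-hadoop has an es.nodes.wan.only setting that restricts all traffic to the configured endpoint instead. A minimal sketch of that configuration, reusing the host and port from my code above:

        SparkConf conf = new SparkConf()
                .setAppName("ES-Spark")
                .setMaster("local[*]")
                .set("es.nodes", "cluster-xxxx.xx.xx.xx")
                .set("es.port", "50467")
                // Disable node discovery: route every request through es.nodes
                // instead of the (possibly internal) addresses the cluster advertises.
                .set("es.nodes.wan.only", "true");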

The dependency used is:

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-spark-20_2.10</artifactId>
  <version>5.6.2</version>
</dependency>

When I try to connect through the transport client instead, everything works fine and I am able to do it.

I have also checked the elasticsearch.yml file, and the http.port property is set to the correct port.
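One detail that may matter: the transport client speaks the binary transport protocol (transport.tcp.port, 9300 by default), while elasticsearch-spark talks to the HTTP/REST port, so the transport client working does not prove that the HTTP port is reachable from the Spark machine. A quick JDK-only reachability check, assuming the same host and port as in my configuration:

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class EsHttpCheck {
        public static void main(String[] args) throws Exception {
            // Same host/port as es.nodes / es.port in the SparkConf above.
            URL url = new URL("http://cluster-xxxx.xx.xx.xx:50467/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000); // fail fast instead of the connector's long retries
            conn.setReadTimeout(5000);
            // A reachable Elasticsearch node answers "/" with cluster info and HTTP 200.
            System.out.println("HTTP status: " + conn.getResponseCode());
            conn.disconnect();
        }
    }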

0 Answers