Spark cannot connect to Elasticsearch 2.4 over HTTP through a load balancer

Time: 2019-06-19 14:43:29

Tags: apache-spark elasticsearch elasticsearch-spark

Our Elasticsearch cluster runs behind a load balancer whose URL is https://es.mycomp.com. I can post documents to it from Postman and from curl, so the firewall is open from my development box. But when I post documents from Spark, I keep getting:

19/06/19 09:35:49 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect
19/06/19 09:35:49 INFO HttpMethodDirector: Retrying request
19/06/19 09:35:49 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection refused: connect
19/06/19 09:35:49 INFO HttpMethodDirector: Retrying request
19/06/19 09:35:50 ERROR NetworkClient: Node [10.127.30.46:433] failed (Connection refused: connect); no other nodes left - aborting...
19/06/19 09:35:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:196)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:379)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[10.127.30.46:433]] 
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:414)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:418)
    at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:122)
    at org.elasticsearch.hadoop.rest.RestClient.esVersion(RestClient.java:564)
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:184)
    ... 10 more

Here is my code:

    val conf = new SparkConf().setMaster("local[2]").setAppName("ESPost")
      .set("es.index.auto.create", "true")
      .set("es.nodes", "es.mycomp.com")
      .set("es.port", "443")
      .set("es.nodes.client.only", "true")
      .set("es.http.timeout", "5m")
      .set("es.scroll.size", "50")
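
The exception message itself points at the `es.nodes.wan.only` setting, and the configuration above uses `es.nodes.client.only` instead (note also that the log reports the node as `10.127.30.46:433`, not `443`). A minimal sketch of the same configuration adjusted for a single load-balancer endpoint terminating TLS on port 443 might look like the following; the `es.nodes.wan.only` and `es.net.ssl` settings here are an untested assumption based on the error message and the es-hadoop configuration options, not a confirmed fix:

```scala
import org.apache.spark.SparkConf

// Sketch: pointing es-hadoop at a WAN/load-balancer endpoint over HTTPS.
// Assumes the balancer terminates TLS at https://es.mycomp.com:443.
val conf = new SparkConf().setMaster("local[2]").setAppName("ESPost")
  .set("es.index.auto.create", "true")
  .set("es.nodes", "es.mycomp.com")
  .set("es.port", "443")
  // Treat the balancer as the only reachable node and skip cluster
  // node discovery, since the data nodes' internal IPs are not routable.
  .set("es.nodes.wan.only", "true")
  // es-hadoop speaks plain HTTP by default; enable SSL for the HTTPS balancer.
  .set("es.net.ssl", "true")
  .set("es.http.timeout", "5m")
  .set("es.scroll.size", "50")
```

With `es.nodes.wan.only` enabled, the connector routes every request through the declared endpoint instead of the node addresses the cluster advertises, which is usually what a load-balanced setup needs.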

We are using Elasticsearch 2.4, and here is my build configuration:

    scalaVersion := "2.11.12"

    val sparkVersion = "1.3.0"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion,
      "org.apache.spark" %% "spark-sql" % sparkVersion,
      "org.elasticsearch" %% "elasticsearch-spark" % "2.4.5",
      "org.apache.spark" % "spark-streaming_2.11" % sparkVersion
    )

0 Answers:

There are no answers.