How do I write to a remote Elasticsearch node with Spark?

Asked: 2019-06-11 15:59:52

Tags: apache-spark hadoop elasticsearch

I have this code:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._
val sc = new SparkContext(conf)
conf.set("es.index.auto.create", "true")
conf.set("es.nodes", "1.2.3.4")
val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
sc.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs")

But when I run it, it tries to connect to localhost:

sc.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs")
19/06/11 11:56:16 ERROR rest.NetworkClient: Node [127.0.0.1:9200] failed (Connection refused (Connection refused)); no other nodes left - aborting...
19/06/11 11:56:16 ERROR rest.NetworkClient: Node [127.0.0.1:9200] failed (Connection refused (Connection refused)); no other nodes left - aborting...
19/06/11 11:56:16 ERROR executor.Executor: Exception in task 2.0 in stage 2.0 (TID 18)
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

How do I configure it to write to a remote ES server?

1 Answer:

Answer 0 (score: 0)

Please see the configuration documentation:

es.nodes.discovery (default true): whether to discover the nodes within the Elasticsearch cluster, or to use only the ones given in es.nodes for metadata queries. Note that this setting only applies during start-up; afterwards, when reading and writing, elasticsearch-hadoop uses the target index shards (and their hosting nodes) unless es.nodes.client.only is enabled.

Set es.nodes.discovery to false.
Example:

EsSpark.saveToEs(userTweetRDD, "twitter/test", Map("es.nodes" -> "xx.xx.xx.xxx", "es.cluster.name" -> "xxxx-xxxxx"))

Then add

"es.nodes.discovery" -> "false"

In your case, the example becomes:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._


val conf: SparkConf = new SparkConf().setAppName("MYESAPP")
  .setMaster("local") // "local" for local testing; use "yarn" if you are running on YARN

conf.set("es.index.auto.create", "true")
conf.set("es.nodes", "1.2.3.4")
conf.set("es.nodes.discovery", "false")



val sc = new SparkContext(conf)

val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
sc.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs")
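
If the cluster is only reachable over a WAN or is a cloud-hosted instance, the error message above also points at es.nodes.wan.only. As a rough sketch (whether you need this depends on your network setup; it is not part of the answer above), the setting would be added to the same SparkConf:

conf.set("es.nodes.wan.only", "true")  // talk only to the addresses in es.nodes; disables node discovery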