Elasticsearch Spark connection for Structured Streaming

Time: 2019-07-01 10:27:27

Tags: apache-spark elasticsearch spark-structured-streaming

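I am trying to connect to Elasticsearch from my Spark program. My Elasticsearch host uses HTTPS, and I could not find a connection property for that. We are using the Spark Structured Streaming Java API, and the connection details are as follows: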

        SparkSession spark = SparkSession.builder()
                .config(ConfigurationOptions.ES_NET_HTTP_AUTH_USER, "username")
                .config(ConfigurationOptions.ES_NET_HTTP_AUTH_PASS, "password")
                .config(ConfigurationOptions.ES_NODES, "my_host_url")
                .config(ConfigurationOptions.ES_PORT, "9200")
                .config(ConfigurationOptions.ES_NET_SSL_TRUST_STORE_LOCATION, "C:\\certs\\elastic\\truststore.jks")
                .config(ConfigurationOptions.ES_NET_SSL_TRUST_STORE_PASS, "my_password")
                .config(ConfigurationOptions.ES_NET_SSL_KEYSTORE_TYPE, "jks")
                .master("local[2]")
                .appName("spark_elastic")
                .getOrCreate();
        spark.conf().set("spark.sql.shuffle.partitions", 2);
        spark.conf().set("spark.default.parallelism", 2);

I get the following error:

19/07/01 12:26:00 INFO HttpMethodDirector: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server 10.xx.xxx.xxx failed to respond
19/07/01 12:26:00 INFO HttpMethodDirector: Retrying request
19/07/01 12:26:00 ERROR NetworkClient: Node [10.xx.xxx.xxx:9200] failed (The server 10.xx.xxx.xxx failed to respond); no other nodes left - aborting...
19/07/01 12:26:00 ERROR StpMain: Error
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:344)

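This is probably because it tries to open the connection over HTTP, whereas in my case I need an HTTPS connection, and I am not sure how to configure that.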

1 Answer:

Answer 0: (score: 0)

The error occurred because Spark could not find the truststore file. It seems we need to add the "file:\\" prefix for the path to be accepted.
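As a rough illustration of the fix described above, here is a minimal sketch that converts a local truststore path into a "file:" URI and collects the related settings under their string keys. The class name EsSslSettings, its methods, and the placeholder path are hypothetical and only serve the example. The "es.net.ssl" entry is an assumption on my part: the question's code never sets it, and elasticsearch-hadoop leaves SSL off by default, so it probably has to be enabled for an HTTPS host. The sketch uses only the JDK so it can be run without Spark on the classpath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark or elasticsearch-hadoop.
final class EsSslSettings {

    // Turn a local filesystem path into a "file:" URI string,
    // so the location is not looked up as a classpath entry.
    static String toFileUri(Path path) {
        return path.toAbsolutePath().toUri().toString();
    }

    // Collect the connector settings under their string keys.
    static Map<String, String> build(String host, String port,
                                     Path trustStore, String trustStorePass) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("es.nodes", host);
        settings.put("es.port", port);
        // Assumption: SSL must be switched on explicitly for an HTTPS host.
        settings.put("es.net.ssl", "true");
        settings.put("es.net.ssl.truststore.location", toFileUri(trustStore));
        settings.put("es.net.ssl.truststore.pass", trustStorePass);
        return settings;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the real host, path and password.
        Map<String, String> settings = build(
                "my_host_url", "9200",
                Paths.get("certs", "elastic", "truststore.jks"),
                "my_password");
        settings.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Each entry could then be passed to SparkSession.builder().config(key, value) in place of the corresponding ConfigurationOptions lines. If the cluster is only reachable through a single public address, the 'es.nodes.wan.only' setting named in the error message may also be worth trying.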