Apache Beam streaming from Pub/Sub to ElasticSearch

Time: 2019-11-20 18:41:44

Tags: java elasticsearch google-cloud-dataflow apache-beam elasticsearch-rest-client

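I am writing a Java streaming pipeline with Apache Beam that reads messages from Google Cloud PubSub and writes them to an ElasticSearch instance. For now I am using the direct runner, but the plan is to deploy the solution on Google Cloud Dataflow.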

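First I wrote a pipeline that reads from PubSub and writes to text files, and that works. Then I set up an ElasticSearch instance, and that works too. I wrote a few documents to it with curl without any trouble.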

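Then, when I tried to perform the writes with Beam's ElasticSearch connector, I started getting errors. Specifically, I get java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest even though I have added the dependencies to my pom.xml file.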

What I have is essentially this:

// Group the incoming messages into fixed two-minute windows.
messages.apply(
        "TwoMinWindow",
        Window.into(FixedWindows.of(new Duration(120 * 1000)))
).apply(
        // Write each window's elements to the "streaming_data" index.
        "ElasticWrite",
        ElasticsearchIO.write()
                .withConnectionConfiguration(
                        ElasticsearchIO.ConnectionConfiguration
                                .create(new String[]{"http://xxx.xxx.xxx.xxx:9200"}, "streaming_data", "string")
                                .withUsername("xxxx")
                                .withPassword("xxxxxxxx")
                )
);

With the DirectRunner I can connect to PubSub, but when the pipeline tries to connect to the ElasticSearch instance I get an exception:

java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest(Ljava/lang/String;Ljava/lang/String;[Lorg/apache/http/Header;)Lorg/elasticsearch/client/Response;
    at org.apache.beam.sdk.util.UserCodeException.wrap (UserCodeException.java:34)
    at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$Write$WriteFn$DoFnInvoker.invokeSetup (Unknown Source)
    at org.apache.beam.sdk.transforms.reflect.DoFnInvokers.tryInvokeSetupFor (DoFnInvokers.java:50)
    at org.apache.beam.runners.direct.DoFnLifecycleManager$DeserializingCacheLoader.load (DoFnLifecycleManager.java:104)
    at org.apache.beam.runners.direct.DoFnLifecycleManager$DeserializingCacheLoader.load (DoFnLifecycleManager.java:91)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture (LocalCache.java:3528)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync (LocalCache.java:2277)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad (LocalCache.java:2154)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get (LocalCache.java:2044)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get (LocalCache.java:3952)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad (LocalCache.java:3974)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get (LocalCache.java:4958)
    at org.apache.beam.runners.direct.DoFnLifecycleManager.get (DoFnLifecycleManager.java:61)
    at org.apache.beam.runners.direct.ParDoEvaluatorFactory.createEvaluator (ParDoEvaluatorFactory.java:129)
    at org.apache.beam.runners.direct.ParDoEvaluatorFactory.forApplication (ParDoEvaluatorFactory.java:79)
    at org.apache.beam.runners.direct.TransformEvaluatorRegistry.forApplication (TransformEvaluatorRegistry.java:169)
    at org.apache.beam.runners.direct.DirectTransformExecutor.run (DirectTransformExecutor.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
    at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest(Ljava/lang/String;Ljava/lang/String;[Lorg/apache/http/Header;)Lorg/elasticsearch/client/Response;
    at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO.getBackendVersion (ElasticsearchIO.java:1348)
    at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$Write$WriteFn.setup (ElasticsearchIO.java:1200)

What I added to pom.xml is:

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${beam.version}</version>
  </dependency>

  <!-- https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client -->
  <dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>${elastic.version}</version>
  </dependency>

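I have been stuck on this problem for a while and I don't know how to solve it. If I use JestClient, I can connect to ElasticSearch without any problem.

Do you have any suggestions?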

1 answer:

Answer 0: (score: 2)

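You are using a newer version of RestClient that does not have the method performRequest(String, String, Header...). If you look at the latest source code, you can see that the method now takes a Request, whereas in older versions there were methods that took Strings and Headers. Those methods were deprecated and then removed from the code on September 1, 2018.

Either change your code to use the newer Elastic Search library, or specify an older version of the library that is compatible with your code (it must predate the removal of those methods, e.g. 6.8.4).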
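To confirm which version of a class actually ends up on the runtime classpath, a small reflection check can help. The following is a minimal sketch using only the standard library. The class name ClasspathCheck and its method names are hypothetical, and the defaults point at a JDK class only so that the example runs without extra dependencies.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic: shows where a class was loaded from and which overloads it has. */
public class ClasspathCheck {

    /** Lists the public overloads of methodName that the loaded class really exposes. */
    public static List<String> overloads(Class<?> type, String methodName) {
        List<String> found = new ArrayList<>();
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toString());
            }
        }
        return found;
    }

    /** Returns the jar or directory the class came from, or "unknown" if the JVM does not report one. */
    public static String origin(Class<?> type) {
        CodeSource src = type.getProtectionDomain().getCodeSource();
        return (src == null || src.getLocation() == null)
                ? "unknown"
                : src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults are placeholders; pass the class and method you want to inspect.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        String methodName = args.length > 1 ? args[1] : "valueOf";

        Class<?> type = Class.forName(className);
        System.out.println(className + " loaded from: " + origin(type));
        for (String signature : overloads(type, methodName)) {
            System.out.println("  " + signature);
        }
    }
}
```

Run on the pipeline's classpath with the arguments org.elasticsearch.client.RestClient and performRequest, it should print the jar that supplies RestClient and the overloads that jar provides. If only a Request-based overload appears, the resolved client is newer than the one the connector was compiled against. Running mvn dependency:tree is another way to see which version Maven resolved.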
