Sending DataSet data to Elasticsearch

Time: 2018-09-04 09:40:19

Tags: java elasticsearch apache-flink

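I am trying to send some data from a DataSet to Elasticsearch using the new Elasticsearch connector, but the only resource I can find is this one, which covers the DataStream API: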

https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/elasticsearch.html

My DataSet is a DataSet of Row (the result of a SQL query). Its content is:

199947,6
199958,3
199964,2
199985,2

I created a static nested class that implements ElasticsearchSinkFunction:

public static class NumberOfTransactionsByBlocks implements ElasticsearchSinkFunction<Row> {

    public void process(Row element, RuntimeContext ctx, RequestIndexer indexer) {
        indexer.add(createIndexRequest(element));

    }

    public IndexRequest createIndexRequest(Row element) {
        Map<String, String> json = new HashMap<>();
        json.put("block_number", element.getField(0).toString());
        json.put("numberOfTransactions", element.getField(1).toString());

        return Requests.indexRequest()
                .index("nbOfTransactionsByBlocks")
                .type("count-transactions")
                .source(json);
    }
}
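
The field mapping done in createIndexRequest can be tried on its own, without Flink or Elasticsearch on the classpath. This is only a minimal sketch, assuming each row arrives as a "block,count" text line like the sample data above; RowDocument and toDocument are hypothetical names used for illustration and are not part of either library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowDocument {

    // Hypothetical helper: turns one "block,count" line into the same
    // two-field document that createIndexRequest builds from a Row.
    static Map<String, String> toDocument(String line) {
        String[] fields = line.split(",");
        if (fields.length != 2) {
            throw new IllegalArgumentException("expected 'block,count' but got: " + line);
        }
        // LinkedHashMap keeps the insertion order of the two fields.
        Map<String, String> json = new LinkedHashMap<>();
        json.put("block_number", fields[0].trim());
        json.put("numberOfTransactions", fields[1].trim());
        return json;
    }

    public static void main(String[] args) {
        // Two of the sample rows from the question.
        for (String line : new String[] {"199947,6", "199958,3"}) {
            System.out.println(toDocument(line));
        }
    }
}
```

Keeping the mapping in a plain method like this makes it easy to check the document shape before wiring it into the sink function.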

My problem is that I don't know how to pass in an instance of the inner class...

DataSet<Row> data = tableEnv.toDataSet(sqlResult, Row.class);
List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("127.0.0.1", 9200, "http"));
httpHosts.add(new HttpHost("10.2.3.1", 9200, "http"));

Map<String, String> config = new HashMap<>();
config.put("bulk.flush.max.actions", "1");   // flush inserts after every event
config.put("cluster.name", "elasticsearch"); // default cluster name


data.output(new ElasticsearchSink<>(config, httpHosts, new NumberOfTransactionsByBlocks()));

Instantiating ElasticsearchSink shows this error:

Cannot infer arguments

But when I specify the type (Row), it says:

'ElasticsearchSink(java.util.Map, java.util.List, org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction, org.apache.flink.streaming.connectors.elasticsearch.ActionRequestFailureHandler, org.apache.flink.streaming.connectors.elasticsearch6.RestClientFactory)' has private access in 'org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink'

Am I doing something wrong?

1 Answer:

Answer 0: (score: 0)

Flink currently (1.6.0) provides four different connectors for Elasticsearch:

  • v1.x: flink-connector-elasticsearch_2.11
  • v2.x: flink-connector-elasticsearch2_2.11
  • v5.x: flink-connector-elasticsearch5_2.11
  • v6.x: flink-connector-elasticsearch6_2.11

Make sure you include the correct Maven dependency in your project.
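
As an illustration, a project targeting Elasticsearch 6.x might declare the connector roughly as follows. The version shown is an assumption based on the Flink release mentioned above; it should match the Flink version your project actually uses.

```xml
<!-- Assumed version: align this with your own Flink release -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-elasticsearch6_2.11</artifactId>
  <version>1.6.0</version>
</dependency>
```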

  

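... has private access in 'org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink'

Judging from the trace you shared, you appear to be using the flink-connector-elasticsearch6_2.11 dependency. Looking at the source, the constructor was made private in v6.x and a Builder was added [commit].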

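So, to add an ElasticsearchSink, you need to use something like:

data.output(
  new ElasticsearchSink.Builder<>(httpHosts, new NumberOfTransactionsByBlocks())
    .setBulkFlushMaxActions(1)
    .build());

Also, the import would be org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink.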