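I am trying to build a data pipeline in which the Logstash jdbc input plugin fetches some data with a SQL query every 5 minutes and the ElasticSearch output plugin writes that data to the ElasticSearch server. In other words, I want the output plugin to partially update existing documents on the ElasticSearch server. My Logstash configuration file looks like this: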
input {
jdbc {
jdbc_driver_library => "/Users/hello/logstash-2.3.2/lib/mysql-connector-java-5.1.34.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:13306/mysqlDB"
jdbc_user => "root"
jdbc_password => "1234"
last_run_metadata_path => "/Users/hello/.logstash_last_run_display"
statement => "SELECT * FROM checkout WHERE checkout_no between :sql_last_value + 1 and :sql_last_value + 5 ORDER BY checkout_no ASC"
schedule => "*/5 * * * *"
use_column_value => true
tracking_column => "checkout_no"
}
}
output {
stdout { codec => json_lines }
elasticsearch {
action => "update"
index => "ecs"
document_type => "checkout"
document_id => "%{checkout_no}"
hosts => ["localhost:9200"]
}
}
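The problem is that the ElasticSearch output plugin does not seem to call the partial update API, e.g. /{index}/{type}/{id}/_update. The manual only lists actions such as index, delete, create, and update, but it does not say which REST API URL each action calls, i.e. whether the update action calls /{index}/{type}/{id}/_update or the /{index}/{type}/{id} API (upsert). I want to call the partial update API from the ElasticSearch output plugin. Is that possible?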
Answer 0 (score: 9)
Setting doc_as_upsert => true and action => "update" works in my production script.
output {
elasticsearch {
hosts => ["es_host"]
document_id => "%{id}" # !!! the id here MUST be the same
index => "logstash-my-index"
timeout => 30
workers => 1
doc_as_upsert => true
action => "update"
}
}
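As for which REST call this ends up making: to my understanding, the output plugin sends events through the _bulk endpoint rather than hitting /{index}/{type}/{id}/_update once per document, and an update action with doc_as_upsert becomes an action line followed by a partial-document line. Below is a minimal Python sketch of that payload shape. The helper name bulk_update_lines and the sample event are hypothetical, used only for illustration; this is not code from Logstash.

```python
import json


def bulk_update_lines(index, doc_id, event, doc_as_upsert=True):
    """Build the two NDJSON lines of one bulk 'update' action.

    Hypothetical helper for illustration; it mirrors the documented
    Elasticsearch bulk format, not the plugin's internal code.
    """
    # First line: which operation, on which index and document id.
    action = {"update": {"_index": index, "_id": doc_id}}
    # Second line: the partial document to merge into the existing one.
    body = {"doc": event}
    if doc_as_upsert:
        # If the id does not exist yet, insert the partial doc as a new document.
        body["doc_as_upsert"] = True
    return json.dumps(action) + "\n" + json.dumps(body) + "\n"


# Sample event (made up for this sketch).
payload = bulk_update_lines("logstash-my-index", "42", {"status": "paid"})
print(payload, end="")
```

Only the fields present in the event are merged into the stored document, which is why document_id has to resolve to the same id as the document you want to update.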
Answer 1 (score: 3)
Yes, it is possible. The Elasticsearch output plugin has a set of upsert options that correspond to the options in the Elasticsearch update API:
upsert itself: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-upsert
scripted_upsert: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-scripted_upsert
doc_as_upsert: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-doc_as_upsert
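To make the correspondence concrete, here is a rough Python sketch of the update API request bodies these three options relate to. The function update_body, its mode argument, and the sample script are hypothetical. The field names follow the current Elasticsearch update API documentation (older versions, such as the 2.x line used in the question, name the script field differently), and the exact body the plugin emits may differ.

```python
import json


def update_body(event, mode, upsert_doc=None, script=None):
    """Return an update API request body for one of three upsert styles.

    Hypothetical helper for illustration only.
    """
    if mode == "doc_as_upsert":
        # The partial document doubles as the document to insert if missing.
        return {"doc": event, "doc_as_upsert": True}
    if mode == "upsert":
        # Partial update if the doc exists, else insert a separate upsert doc.
        return {"doc": event, "upsert": upsert_doc or {}}
    if mode == "scripted_upsert":
        # The script runs whether or not the document already exists.
        return {"script": script, "upsert": upsert_doc or {}, "scripted_upsert": True}
    raise ValueError("unknown mode: " + mode)


# Sample values (made up for this sketch).
sample_script = {"source": "ctx._source.count = params.count", "params": {"count": 1}}
for mode in ("doc_as_upsert", "upsert", "scripted_upsert"):
    print(json.dumps(update_body({"status": "paid"}, mode, script=sample_script)))
```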