Logstash receives a batch of events once a week and forwards them to Elasticsearch.
How can I configure Logstash so that it tells Elasticsearch to delete the old events?
Edit 2018-03-28:
    {host: "host1", type: "packages", records: [{name: "pkg1", ver: "1"}, {name: "pkg2", ver: "2"}, ...]}
    {host: "host1", type: "mounts", records: [{path: "path1", dev: "dev1"}, {path: "path2", dev: "dev2"}, ...]}
    {host: "host1", type: "???", records: [{???}, {???}, ...]}
    ...
    {host: "host2", type: "packages", records: [{name: "pkg1", ver: "1"}, {name: "pkg2", ver: "2"}, ...]}
    {host: "host2", type: "mounts", records: [{path: "path1", dev: "dev1"}, {path: "path2", dev: "dev2"}, ...]}
    {host: "host2", type: "???", records: [{???}, {???}, ...]}
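These are the various events for each host. Each event contains an array of records whose schema is not known in advance.
To be able to search precisely on the fields inside the array, I have to split the array into multiple Elasticsearch documents.
(I know there is a way to search inside the array without splitting it, but that is another story: Nested Object. In my case the inner objects have no fixed schema, so I cannot define every inner field in advance.)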
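To illustrate why searching the unsplit array is imprecise: as far as I understand, Elasticsearch flattens an array of plain objects into one multi-valued field per key, so the link between a package name and its version is lost. The following is only a toy model of that behaviour in plain Ruby; `flatten_records` and `flattened_match?` are names I made up, not Elasticsearch APIs.

```ruby
# Toy model: flatten an array of objects into multi-valued fields,
# e.g. "records.name" => ["pkg1", "pkg2"].
def flatten_records(records)
  flat = Hash.new { |hash, key| hash[key] = [] }
  records.each do |record|
    record.each { |key, value| flat["records.#{key}"] << value }
  end
  flat
end

# A query on the flattened form checks each field independently,
# so the values may come from different array elements.
def flattened_match?(flat, conditions)
  conditions.all? { |field, value| flat["records.#{field}"].include?(value) }
end

records = [
  { 'name' => 'pkg1', 'ver' => '1' },
  { 'name' => 'pkg2', 'ver' => '2' }
]
flat = flatten_records(records)

# No single record is pkg1 at version 2, yet the flattened form matches.
false_positive = flattened_match?(flat, 'name' => 'pkg1', 'ver' => '2')
real_match = records.any? { |r| r['name'] == 'pkg1' && r['ver'] == '2' }
```

Here `false_positive` comes out true while `real_match` is false, which is the imprecision that one document per record avoids.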
So the documents stored in Elasticsearch look like this:
    {host: "host1", type: "packages", record: {name: "pkg1", ver: "1"}}
    {host: "host1", type: "packages", record: {name: "pkg2", ver: "2"}}
    {host: "host1", type: "mounts", record: {path: "path1", dev: "dev1"}}
    {host: "host1", type: "mounts", record: {path: "path2", dev: "dev2"}}
    {host: "host1", type: "???", record: {???}}
    {host: "host1", type: "???", record: {???}}
    {host: "host2", type: "packages", record: {name: "pkg1", ver: "1"}}
    {host: "host2", type: "packages", record: {name: "pkg2", ver: "2"}}
    {host: "host2", type: "mounts", record: {path: "path1", dev: "dev1"}}
    {host: "host2", type: "mounts", record: {path: "path2", dev: "dev2"}}
    {host: "host2", type: "???", record: {???}}
    {host: "host2", type: "???", record: {???}}
    ...
    input { ... }
    filter {
      split {
        # split the "records" array into one event per element
        field => "records"
      }
      mutate {
        # after the split the field holds a single element, so rename it
        rename => { "records" => "record" }
      }
    }
    output {
      elasticsearch {
        hosts => ["ELASTIC_IP:PORT"]
        index => "packages-%{+YYYY.MM.dd}"
      }
    }
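What the `split` and `rename` steps in the filter above do to a single event can be sketched in plain Ruby. This is only an approximation for illustration, not the plugin's implementation; `split_records` is a hypothetical helper, and events are modelled as plain hashes.

```ruby
# Turn one event carrying a "records" array into one document per element.
# Every other field of the event (host, type, ...) is copied unchanged.
def split_records(event)
  shared = event.reject { |key, _| key == 'records' }
  event.fetch('records', []).map do |record|
    shared.merge('record' => record)
  end
end

event = {
  'host' => 'host1',
  'type' => 'packages',
  'records' => [
    { 'name' => 'pkg1', 'ver' => '1' },
    { 'name' => 'pkg2', 'ver' => '2' }
  ]
}
docs = split_records(event)
```

The number of resulting documents follows the length of the array, which is why the document count for a host changes from one weekly run to the next.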
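So after new data arrives I want to delete the old data for that host.
Because the output is multiple documents rather than a single one, sometimes more and sometimes fewer, this is not a simple update. It has to be a full delete followed by an add.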
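A toy in-memory sketch of the difference, in plain Ruby with no Elasticsearch involved. The store is just a hash, and the positional document IDs (`host/type/index`) are made up for this illustration; they are not something my setup actually uses.

```ruby
# Upsert: overwrite documents by ID, never remove anything.
def upsert(store, host, type, records)
  records.each_with_index do |record, i|
    store["#{host}/#{type}/#{i}"] = record
  end
  store
end

# Full replace: drop every document for this host and type, then add the new ones.
def replace_all(store, host, type, records)
  store.delete_if { |id, _| id.start_with?("#{host}/#{type}/") }
  upsert(store, host, type, records)
end

last_week = [{ 'name' => 'pkg1' }, { 'name' => 'pkg2' }, { 'name' => 'pkg3' }]
this_week = [{ 'name' => 'pkg1' }, { 'name' => 'pkg4' }]

# Upsert only: the third document from last week survives as stale data.
updated = upsert(upsert({}, 'host1', 'packages', last_week), 'host1', 'packages', this_week)

# Delete and add: only this week's documents remain.
replaced = replace_all(upsert({}, 'host1', 'packages', last_week), 'host1', 'packages', this_week)
```

With fewer records this week than last week, `updated` still holds the leftover `pkg3` document, while `replaced` holds exactly the two new ones.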
Answer 0 (score: 0)
OK, I have confirmed that the old index can be deleted from a ruby filter.
    input { ... }
    filter {
      split {
        # split the "records" array into one event per element
        field => "records"
      }
      mutate {
        # after the split the field holds a single element, so rename it
        rename => { "records" => "record" }
      }
      ruby {
        init => "
          require 'net/http'
          require 'uri'
        "
        code => "
          # delete the index that holds the old documents for this type and host
          uri = URI.parse('http://docker.for.mac.localhost:19200/inventory-' + event.get('type') + '@' + event.get('host'))
          http = Net::HTTP.new(uri.host, uri.port)
          req = Net::HTTP::Delete.new(uri.request_uri)
          req.basic_auth 'elastic', 'changeme'
          res = http.request(req)
        "
      }
    }
    output {
      elasticsearch {
        hosts => ["ELASTIC_IP:PORT"]
        index => "inventory-%{type}@%{host}"
      }
    }
The important part is to use a separate index for each combination of host and type, so that the index to delete is easy to find.
    index => "inventory-%{type}@%{host}"
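The index naming and the DELETE request can also be tried outside Logstash. The sketch below only builds the request and does not send it; `inventory_index` and `build_delete_request` are hypothetical helper names, and `http://localhost:9200` is a placeholder base URL, not my real endpoint.

```ruby
require 'net/http'
require 'uri'

# Mirrors the output pattern "inventory-%{type}@%{host}".
def inventory_index(type, host)
  "inventory-#{type}@#{host}"
end

# Builds (but does not send) the DELETE request for one host/type index.
# Returns the parsed URI together with the request object.
def build_delete_request(base_url, type, host)
  uri = URI.parse("#{base_url}/#{inventory_index(type, host)}")
  req = Net::HTTP::Delete.new(uri.request_uri)
  [uri, req]
end

uri, req = build_delete_request('http://localhost:9200', 'packages', 'host1')
# To actually send it (needs a reachable Elasticsearch):
#   Net::HTTP.new(uri.host, uri.port).request(req)
```

Because the index name is derived from `type` and `host` alone, the same two values are enough both to write the new documents and to find the index that has to go.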