I have the following index:
POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
I am running the following search:
GET /cars/transactions/_search
{
  "size" : 0,
  "aggs" : {
    "popular_colors" : {
      "terms" : {
        "field" : "color"
      }
    }
  }
}
The response I get is:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "red",
          "doc_count": 4
        },
        {
          "key": "blue",
          "doc_count": 2
        },
        {
          "key": "green",
          "doc_count": 2
        }
      ]
    }
  }
}
My question is: how can I reindex this document into a different index?
I tried:
input {
  elasticsearch {
    hosts => "localhost"
    index => "cars"
    query => '{
      "size" : 0,
      "aggs" : {
        "popular_colors" : {
          "terms" : {
            "field" : "color"
          }
        }
      }
    }'
    size => 500
    scroll => "5m"
    docinfo => true
  }
}
But it does not work, because the plugin's search_type is scan, which does not support aggregations.
I also tried:
input {
  file {
    path => "C:\ELK-STACK\logstash-2.3.4\bin\out.json"
    start_position => "beginning"
    codec => json_lines
  }
}
The content of out.json is:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":8,"max_score":1.0,"hits":[{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7l","_score":1.0,"_source":{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7m","_score":1.0,"_source":{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7p","_score":1.0,"_source":{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7o","_score":1.0,"_source":{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7n","_score":1.0,"_source":{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7q","_score":1.0,"_source":{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7r","_score":1.0,"_source":{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }},{"_index":"cars","_type":"transactions","_id":"AVexGB7_99OIq3MORm7s","_score":1.0,"_source":{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }}]}}
But it produces no output after:
Settings: Default pipeline workers: 8
Pipeline main started
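I think this is because the JSON file is not prepared for the json plugin and I need to do some preparation first (for example with the Java API), but I would like to avoid that if possible.
Thanks!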
Answer 0 (score: 0)
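As you noticed, the elasticsearch input plugin does not support aggregations. You can use the http_poller input plugin instead to send the aggregation query to Elasticsearch periodically (or just once a day). Then, using the elasticsearch output, you can send the resulting aggregation back to ES.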
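To make the idea of "sending the resulting aggregation back to ES" a bit more concrete, here is a minimal Python sketch, outside of Logstash, that turns the aggregation response from the question into a _bulk body with one document per bucket. The function name buckets_to_bulk and the field name color in the generated documents are my own assumptions for illustration; the bulk layout simply mirrors the one used to load the cars data in the question.

```python
import json


def buckets_to_bulk(response, agg_name, key_field):
    """Build a _bulk request body with one document per terms bucket.

    key_field is a hypothetical name for the field that stores the bucket key.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    lines = []
    for bucket in buckets:
        # Action line, same shape as in the question's bulk request.
        lines.append(json.dumps({"index": {}}))
        # Document line: the bucket key plus its document count.
        lines.append(json.dumps({key_field: bucket["key"],
                                 "doc_count": bucket["doc_count"]}))
    # A bulk body must end with a newline; an empty result stays empty.
    return "\n".join(lines) + "\n" if lines else ""


# Trimmed copy of the response shown in the question.
response = {
    "aggregations": {
        "popular_colors": {
            "buckets": [
                {"key": "red", "doc_count": 4},
                {"key": "blue", "doc_count": 2},
                {"key": "green", "doc_count": 2},
            ]
        }
    }
}

body = buckets_to_bulk(response, "popular_colors", "color")
print(body)
```

The resulting text could be posted to the _bulk endpoint of the target index. Whether you want one document per bucket or the whole response as a single document depends on how you plan to query the new index.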
The configuration basically looks like this (note that the aggregation query needs to be URL-encoded and sent to ES using the source=... parameter).
input {
  http_poller {
    urls => {
      test1 => 'http://localhost:9200/cars/transactions/_search?source=%7B%22size%22%3A0%2C%22aggs%22%3A%7B%22popular_colors%22%3A%7B%22terms%22%3A%7B%22field%22%3A%22color%22%7D%7D%7D%7D'
    }
    # checking once per day
    interval => 86400
    codec => "json"
  }
}
filter {
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my_aggs"
  }
}
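If you need to produce the URL-encoded source=... value for a different aggregation, a small Python sketch like the following should do it. It uses only the standard library; the helper name build_search_url is my own, and the base URL is the one from the configuration above.

```python
import json
from urllib.parse import quote


def build_search_url(base_url, query):
    """Return base_url with the query attached as a URL-encoded source parameter."""
    # Compact separators avoid encoding unnecessary whitespace.
    compact = json.dumps(query, separators=(",", ":"))
    # safe="" forces every reserved character (braces, quotes, colons,
    # commas) to be percent-encoded.
    return base_url + "?source=" + quote(compact, safe="")


query = {
    "size": 0,
    "aggs": {"popular_colors": {"terms": {"field": "color"}}},
}

url = build_search_url("http://localhost:9200/cars/transactions/_search", query)
print(url)
```

The printed URL can be pasted as the value of an entry in the urls setting of http_poller.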