How do I get aggregation buckets per index in a multi-index search?

Time: 2019-08-29 08:31:42

Tags: elasticsearch elastic-stack elasticsearch-aggregation

I have an aggregation query against a single index. The aggs look like this:

"aggs":{  
    "my_buckets":{  
      "composite":{  
        "size":1000,
        "sources":[  
          {  
            "checksumField":{  
              "terms":{  
               "field":"checkSum.keyword"
              }
            }
          }
        ]
      },
      "aggs":{  
        "catagories":{  
          "top_hits":{  
            "sort":[  
              {  
                "createdDate":{  
                  "order":"desc"
                 }
              }
            ],
            "size":1,
            "_source":[  
             "some_field"
            ]
          }
        }
      }
    }
  }

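This works as needed for a single index. However, when I include multiple indices as comma-separated values in the GET URI, and the first index alone has many entries (say 1000), I cannot see results from the other indices, because the maximum size of the aggregation result is set to 1000. What I need is the top hits across all indices (for example, with two indices, the top 500 from each). How can I modify the aggs body to get this kind of aggregated result?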

2 answers:

Answer 0: (score: 0)

In the sources array, you can add a terms aggregation on the _index field.
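As a rough illustration of this suggestion, the sketch below builds a composite `sources` array with an `_index` terms source placed before the checksum source from the question. The helper name `composite_sources_with_index` and the source name `indexAgg` are hypothetical, chosen only for this example. Note that composite buckets are then keyed by (index, checksum) and returned in that order, so a single page of 1000 buckets may still come from one index; paging through the results with the composite `after` parameter would likely be needed to reach the others.

```python
import json


def composite_sources_with_index(checksum_field="checkSum.keyword"):
    """Return composite sources keyed first by index name, then by checksum."""
    return [
        # Hypothetical source name; groups buckets by the index they came from.
        {"indexAgg": {"terms": {"field": "_index"}}},
        # Same source as in the question's original query.
        {"checksumField": {"terms": {"field": checksum_field}}},
    ]


aggs = {
    "my_buckets": {
        "composite": {"size": 1000, "sources": composite_sources_with_index()},
    }
}

# Print the fragment so it can be pasted into a _search request body.
print(json.dumps(aggs, indent=2))
```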

Answer 1: (score: 0)

I solved this. Below is the aggs section, which returns composite buckets by index:

GET index1,index2,index3/type/_search

 "aggs": {
    "my_buckets": {
      "composite": {
        "size": 3,
        "sources": [
          {
            "indexAgg": {
              "terms": {
                "field": "_index"
              }
            }
          }
        ]
      },
      "aggs": {
        "checksumField": {
          "terms": {
            "field": "checkSum.keyword",
            "size":2
          },
          "aggs": {
            "catagories": {
              "top_hits": {
                "sort": [
                  {
                    "createdDate": {
                      "order": "desc"
                    }
                  }
                ],
                "size": 1,
                "_source": [
                  "some_field"
                ]
              }
            }
          }
        }
      }
    }
  }

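The resulting aggregation produces three main buckets (one for each of the three indices). Within each of them there are 2 buckets based on the checksum field, like those returned by the original query in the question. That inner size is the value I need to compute from the number of indices supplied, by dividing 1000 evenly among them. With these changes, I get a fixed number of hits per index.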
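The size calculation described above can be sketched as follows. This is only an illustration, assuming a total budget of 1000 buckets split evenly with integer division (any remainder is dropped); the function name `per_index_aggs` and its parameters are hypothetical, while the field and aggregation names are copied from the query above.

```python
import json


def per_index_aggs(indices, total_buckets=1000, checksum_field="checkSum.keyword"):
    """Build the aggs body, splitting a bucket budget evenly across indices."""
    if not indices:
        raise ValueError("at least one index is required")
    # Even share per index; the remainder of the division is discarded.
    per_index = total_buckets // len(indices)
    return {
        "my_buckets": {
            "composite": {
                # One composite bucket per index in the request.
                "size": len(indices),
                "sources": [{"indexAgg": {"terms": {"field": "_index"}}}],
            },
            "aggs": {
                "checksumField": {
                    "terms": {"field": checksum_field, "size": per_index},
                    "aggs": {
                        "catagories": {
                            "top_hits": {
                                "sort": [{"createdDate": {"order": "desc"}}],
                                "size": 1,
                                "_source": ["some_field"],
                            }
                        }
                    },
                }
            },
        }
    }


indices = ["index1", "index2"]
body = {"aggs": per_index_aggs(indices)}

# The index list also forms the comma-separated part of the request path.
print("GET " + ",".join(indices) + "/_search")
print(json.dumps(body, indent=2))
```

With two indices this yields an inner terms size of 500, matching the "500 from each" case in the question.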