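Elasticsearch: generating an aggregated field during a search query

Time: 2018-02-02 18:18:30

Tags: elasticsearch search aggregation

ES newbie here. I am trying to implement a search engine from sources in a single index with the following schema: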

index:paper
{
"title": string,
"author": string,
"id": string,
"references": [string:another_paper.id, string:another_paper.id, ...],
"pubDate": date
}

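Let's say I want to search for all papers by the author "A. Smith" published between 2017-01-09 and 2017-01-30.

How would I craft my search query to get results with a generated field stating how many times each document is referenced by other documents under their "references" field? Is this even possible in ES?

Execution speed is not critical. I can live with relatively slow execution, but I would prefer not to update existing documents when new documents are uploaded.

Thanks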

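1 Answer:

Answer 0: (score: 0)

You can definitely get results filtered by author name and date range. With the query below, you get the documents that match, plus a count for each referenced ID of how many matching documents reference it.

In short, you can count how often each ID is referenced by the documents that match the query.

For example, suppose you index 3 documents: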

{
  "title": "title1",
  "author": "bob",
  "id": "id1",
  "references": [
    "id1",
    "id2",
    "id3"
  ],
  "pubDate": "01-01-2018"
},
{
  "title": "title2",
  "author": "harry",
  "id": "id2",
  "references": [
    "id1",
    "id3",
    "id7",
    "id8"
  ],
  "pubDate": "01-02-2018"
},
{
  "title": "title3",
  "author": "bob",
  "id": "id3",
  "references": [
    "id1",
    "id4",
    "id7",
    "id9"
  ],
  "pubDate": "01-03-2018"
}
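If it helps to see how such documents could be loaded, here is a minimal Python sketch that builds a request body for the Elasticsearch Bulk API (one action line followed by one source line per document, newline-terminated). The helper name `build_bulk_payload` is hypothetical. The index name, type name, and the use of each document's `id` as `_id` are taken from the query and response in this answer; the sample document is copied from the `_source` of the `id1` hit in that response. The `_type` entry assumes an older Elasticsearch version that still uses mapping types.

```python
import json


def build_bulk_payload(docs, index="test_stackoverflow_agg", doc_type="type1"):
    """Return an NDJSON body for POST /_bulk (hypothetical helper)."""
    lines = []
    for doc in docs:
        # Action line: index the document under its own "id" value.
        # "_type" is an assumption tied to older Elasticsearch versions.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The Bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


sample_docs = [
    {
        "title": "title1",
        "author": "bob",
        "id": "id1",
        "references": ["id1", "id2", "id3"],
        "pubDate": "2018-01-02",
    }
]

payload = build_bulk_payload(sample_docs)
print(payload)
```

The string would be sent as the request body with the `application/x-ndjson` content type; no HTTP call is made in this sketch.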

After that, you can run the query:

GET test_stackoverflow_agg/type1/_search
{
  "query": {
    "query_string": {
      "query": "author:bob AND pubDate:[2018-01-02 TO 2018-01-04]"
    }
  },
  "aggs": {
    "agg1": {
      "terms": {
        "field": "references",
        "size": 10
      }
    }
  }
}
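To reuse this request for other authors and date ranges, the body can be generated. The following is a minimal Python sketch; the function name `build_search_body` and its parameters are assumptions for illustration, and it only builds the JSON without contacting a cluster.

```python
import json


def build_search_body(author, start, end, size=10):
    """Build the request body: a query_string filter plus a terms aggregation."""
    return {
        "query": {
            "query_string": {
                # Same query_string syntax as the request above.
                "query": f"author:{author} AND pubDate:[{start} TO {end}]"
            }
        },
        "aggs": {
            # One bucket per distinct value of the "references" field.
            "agg1": {"terms": {"field": "references", "size": size}}
        },
    }


body = build_search_body("bob", "2018-01-02", "2018-01-04")
print(json.dumps(body, indent=2))
```

An author name containing spaces, such as "A. Smith", would need to be wrapped in double quotes inside the query string, which this sketch does not do.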

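The query part specifies which documents to filter on, and the aggregation part specifies the field to aggregate on: here it returns, for each unique ID found in the references field, the number of matching documents that contain it.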
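What the terms aggregation computes can be imitated in plain Python. This is only a sketch of the counting logic, not of how Elasticsearch implements it; the function name `terms_buckets` is hypothetical, and the input is the pair of documents that the query part selects in the response shown in this answer. The explicit sort (count descending, then key ascending) is there to mirror the bucket order in that response.

```python
from collections import Counter

# The two documents selected by the query part (author bob, date range).
matching_docs = [
    {"id": "id3", "references": ["id1", "id4", "id7", "id9"]},
    {"id": "id1", "references": ["id1", "id2", "id3"]},
]


def terms_buckets(docs, field, size=10):
    """Count, per distinct value of `field`, how many docs contain it."""
    counts = Counter()
    for doc in docs:
        # set() so a value repeated inside one document counts once,
        # since doc_count is a number of documents.
        counts.update(set(doc.get(field, [])))
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered[:size]]


buckets = terms_buckets(matching_docs, "references")
for bucket in buckets:
    print(bucket["key"], bucket["doc_count"])
```

Because only the matching documents are counted, a reference made by a document outside the filter (for example one by another author) does not contribute to any bucket.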

Here is the result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0460204,
    "hits": [
      {
        "_index": "test_stackoverflow_agg",
        "_type": "type1",
        "_id": "id3",
        "_score": 1.0460204,
        "_source": {
          "title": "title3",
          "author": "bob",
          "id": "id3",
          "references": [
            "id1",
            "id4",
            "id7",
            "id9"
          ],
          "pubDate": "2018-01-03"
        }
      },
      {
        "_index": "test_stackoverflow_agg",
        "_type": "type1",
        "_id": "id1",
        "_score": 1.0460204,
        "_source": {
          "title": "title1",
          "author": "bob",
          "id": "id1",
          "references": [
            "id1",
            "id2",
            "id3"
          ],
          "pubDate": "2018-01-02"
        }
      }
    ]
  },
  "aggregations": {
    "agg1": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "id1",
          "doc_count": 2
        },
        {
          "key": "id2",
          "doc_count": 1
        },
        {
          "key": "id3",
          "doc_count": 1
        },
        {
          "key": "id4",
          "doc_count": 1
        },
        {
          "key": "id7",
          "doc_count": 1
        },
        {
          "key": "id9",
          "doc_count": 1
        }
      ]
    }
  }
}
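The response keeps hits and buckets separate, so the per-document "generated field" has to be assembled on the client. Below is a minimal Python sketch of one way to do that, using a trimmed copy of the response above. The helper names and the field name `referencedBy` are hypothetical. Note that these counts only cover references made by documents that match the query; the second helper sketches a follow-up request body without the author and date filter, using the terms aggregation's `include` option to restrict buckets to the hit IDs, which would give index-wide counts.

```python
# Trimmed copy of the response shown above (only the parts used here).
response = {
    "hits": {
        "hits": [
            {"_id": "id3", "_source": {"title": "title3", "id": "id3"}},
            {"_id": "id1", "_source": {"title": "title1", "id": "id1"}},
        ]
    },
    "aggregations": {
        "agg1": {
            "buckets": [
                {"key": "id1", "doc_count": 2},
                {"key": "id2", "doc_count": 1},
                {"key": "id3", "doc_count": 1},
                {"key": "id4", "doc_count": 1},
                {"key": "id7", "doc_count": 1},
                {"key": "id9", "doc_count": 1},
            ]
        }
    },
}


def attach_reference_counts(resp, agg_name="agg1", field="referencedBy"):
    """Copy each hit's _source and add the bucket count for its own ID."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    counts = {b["key"]: b["doc_count"] for b in buckets}
    results = []
    for hit in resp["hits"]["hits"]:
        doc = dict(hit["_source"])
        # IDs with no bucket were not referenced by any matching document.
        doc[field] = counts.get(hit["_id"], 0)
        results.append(doc)
    return results


def build_index_wide_count_body(ids):
    """Follow-up request body: count references to `ids` across the whole index."""
    ids = list(ids)
    return {
        "size": 0,  # no hits needed, only the aggregation
        "aggs": {
            "agg1": {
                "terms": {
                    "field": "references",
                    "include": ids,  # only build buckets for these IDs
                    "size": max(len(ids), 1),
                }
            }
        },
    }


enriched = attach_reference_counts(response)
for doc in enriched:
    print(doc["id"], doc["referencedBy"])

follow_up = build_index_wide_count_body(hit["_id"] for hit in response["hits"]["hits"])
```

With the response above, `id1` gets a count of 2 (one of which is its own self-reference) and `id3` gets 1. This approach leaves stored documents untouched, at the cost of a second request when index-wide counts are wanted.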