Question

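I want to break down Elasticsearch results into buckets so that similar documents (those with the most matching terms on an analyzed field) are grouped together in the results. I can't figure out how to aggregate individual documents into buckets this way.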

Here is the basic mapping:

PUT movies
{
  "mappings": {
    "movie": { 
      "properties": { 
        "id":    { "type": "long" }, 
        "title": { "type" : "text" }
      }
    }
  }
}

Now, for example, if a query is run for hunger, the results should be grouped into buckets of matching documents that share the most similar terms:

{
    "buckets": {
        "1": [
            {
                "title": "The Hunger Games"
            },
            {
                "title": "The Hunger Games: Mockingjay"
            },
            {
                "title": "The Hunger Games: Catching Fire"
            }
        ],
        "2": [
            {
                "title": "Aqua Teen Hunger Force"
            },
            {
                "title": "Force of Hunger"
            }
        ],
        "3": [
            {
                "title": "Hunger Pain"
            }
        ],
        :
        :
        :
    }
}

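In the example above, similar documents are grouped into separate buckets based on having at least two matching terms. Any matching title that shares no similar terms with the others is still included in the results as its own bucket (e.g. bucket #3).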
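
To make the intended grouping rule concrete, here is a minimal client-side sketch in Python. It is not an Elasticsearch aggregation, only an illustration of the rule applied to titles that have already been returned by a query. The function names (`tokenize`, `group_by_shared_terms`) and the `min_shared` parameter are hypothetical, and the regex tokenizer is only a rough stand-in for the analyzer on the `title` field. Note that it merges groups transitively: if A matches B and B matches C, all three land in one bucket even if A and C share fewer than two terms.

```python
import re
from itertools import combinations


def tokenize(title):
    # Rough approximation of an analyzed text field: lowercase word tokens.
    return set(re.findall(r"\w+", title.lower()))


def group_by_shared_terms(titles, min_shared=2):
    """Group titles that share at least `min_shared` terms (hypothetical helper)."""
    tokens = [tokenize(t) for t in titles]
    parent = list(range(len(titles)))  # union-find forest over title indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(range(len(titles)), 2):
        if len(tokens[a] & tokens[b]) >= min_shared:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller index as root so buckets follow input order.
                parent[max(ra, rb)] = min(ra, rb)

    groups = {}
    for i, title in enumerate(titles):
        groups.setdefault(find(i), []).append(title)
    return list(groups.values())


titles = [
    "The Hunger Games",
    "The Hunger Games: Mockingjay",
    "The Hunger Games: Catching Fire",
    "Aqua Teen Hunger Force",
    "Force of Hunger",
    "Hunger Pain",
]

for number, bucket in enumerate(group_by_shared_terms(titles), start=1):
    print(number, bucket)
```

With the six sample titles this yields the three buckets shown in the desired output. What I am looking for is a way to get the same result from Elasticsearch itself rather than by post-processing hits like this.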

Any suggestions are appreciated.

Document buckets grouped by term frequency
