Question

I store book titles in Elasticsearch, and they all belong to many stores. Like this:

{
    "books": [
        {
            "id": 1,
            "title": "Title 1",
            "store": "store1" 
        },
        {             
            "id": 2,
            "title": "Title 1",
            "store": "store2" 
        },
        {             
            "id": 3,
            "title": "Title 1",
            "store": "store3" 
        },
        {             
            "id": 4,
            "title": "Title 2",
            "store": "store2" 
        },
        {             
            "id": 5,
            "title": "Title 2",
            "store": "store3" 
        }
    ]
}

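How can I get all the books grouped by title, with one result per group (one row per group of identical titles, so that I can get all the IDs and stores)?

Based on the data above, I expect two results, each containing all the IDs and stores.

Expected result: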

{
"hits":{
    "total" : 2,
    "hits" : [
        {                
            "0" : {
                "title" : "Title 1",
                "group": [
                     {
                         "id": 1,
                         "store": "store1"
                     },
                     {
                         "id": 2,
                         "store": "store2"
                     },
                     {
                         "id": 3,
                         "store": "store3"
                     }
                ]
            }
        },
        {                
            "1" : {
                "title" : "Title 2",
                "group": [
                     {
                         "id": 4,
                         "store": "store2"
                     },
                     {
                         "id": 5,
                         "store": "store3"
                     }
                ]
            }
        }
    ]
}
}

Answer 1

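What you are looking for is not possible in Elasticsearch, at least not in the current version (1.1).

There is a long-standing open issue for this feature, with a lot of +1s and demand behind it.

As for statements: Simon says it needs a lot of refactoring, and although it is planned, there is no way to say when it will be implemented, let alone shipped.

Clinton Gormley made a similar statement in his webinar: field grouping takes a lot of effort to get right, especially because Elasticsearch is by nature a sharded, distributed environment. If you disregard sharding it would not be a big deal, but Elasticsearch only wants to ship features that scale across the whole system and work on hundreds of machines the same way they do on a single box.

If you are not tied to Elasticsearch, Solr offers such a feature.

Otherwise, the best solution for now is probably to do it client-side. That is, query some documents, do the grouping in your client, and if needed fetch more results to fill the group size you want (as far as I know, this is what Solr does under the hood).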
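The client-side approach described above can be sketched roughly as follows. This is a minimal illustration in Python, assuming the hits have already been fetched and look like the `hits.hits` array of a search response; the function name `group_hits_by_title` and the sample data are hypothetical, and fetching further pages to fill up groups is left out.

```python
def group_hits_by_title(hits):
    """Group search hits by title, keeping titles in first-seen order."""
    groups = {}  # plain dicts preserve insertion order in Python 3.7+
    for hit in hits:
        source = hit["_source"]
        groups.setdefault(source["title"], []).append(
            {"id": source["id"], "store": source["store"]}
        )
    return [{"title": title, "group": members} for title, members in groups.items()]


# Hand-written hits mirroring the documents from the question.
sample_hits = [
    {"_source": {"id": 1, "title": "Title 1", "store": "store1"}},
    {"_source": {"id": 2, "title": "Title 1", "store": "store2"}},
    {"_source": {"id": 3, "title": "Title 1", "store": "store3"}},
    {"_source": {"id": 4, "title": "Title 2", "store": "store2"}},
    {"_source": {"id": 5, "title": "Title 2", "store": "store3"}},
]

grouped = group_hits_by_title(sample_hits)
for entry in grouped:
    print(entry["title"], entry["group"])
```

Grouping in the client is simple, but total counts and pagination per group have to be handled by the caller.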

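It is not exactly what you want, but you could also go with aggregations: create a bucket for the title and run a sub-aggregation on the id field. You won't get the store values, but once you have the IDs you can retrieve them from your data store.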

{
    "aggs" : {
        "titles" : {
            "terms" : { "field" : "title" },
            "aggs": {
                "ids": {
                    "terms": { "field" : "id" }
                }
            }
        }
    }
}
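To show how the result of that aggregation could be consumed, here is a small Python sketch that reads the IDs per title out of a response. The response below is hand-written in the usual `buckets` / `key` / `doc_count` shape rather than captured from a real cluster, and `titles_to_ids` is a hypothetical helper. It also assumes `title` is indexed as a single, non-analyzed term; on an analyzed field the buckets would be individual tokens instead of whole titles.

```python
def titles_to_ids(response):
    """Map each title bucket to the ID keys found in its "ids" sub-aggregation."""
    buckets = response["aggregations"]["titles"]["buckets"]
    return {
        bucket["key"]: [sub["key"] for sub in bucket["ids"]["buckets"]]
        for bucket in buckets
    }


# Hand-written response for the data in the question (assumed shape).
sample_response = {
    "aggregations": {
        "titles": {
            "buckets": [
                {
                    "key": "Title 1",
                    "doc_count": 3,
                    "ids": {"buckets": [
                        {"key": 1, "doc_count": 1},
                        {"key": 2, "doc_count": 1},
                        {"key": 3, "doc_count": 1},
                    ]},
                },
                {
                    "key": "Title 2",
                    "doc_count": 2,
                    "ids": {"buckets": [
                        {"key": 4, "doc_count": 1},
                        {"key": 5, "doc_count": 1},
                    ]},
                },
            ]
        }
    }
}

ids_by_title = titles_to_ids(sample_response)
print(ids_by_title)
```

The collected IDs can then be used for a second lookup to fetch the store values.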

Edit: It looks like result grouping will be implemented soon, using the top_hits aggregation.

Answer 2

You can achieve the desired result above by nesting a top_hits aggregation inside other aggregations. E.g.:

{
    "aggs": {
        "set": {
            "terms": {
                "field": "id"
            },
            "aggs": {
                "color": {
                    "terms": {
                        "field": "color"
                    },
                    "aggs": {
                        "products": {
                            "top_hits": {
                                "_source": {
                                    "include": ["size"]
                                }
                            }
                        }
                    }
                },
                "product": {
                    "top_hits": {
                        "_source": {
                            "include": ["productDetails"]
                        },
                        "size": 1
                    }
                }
            }
        }
    }
}
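The example above uses field names from a different index. Adapted to the books in the question, the same idea might look like the Python sketch below: a `terms` aggregation on `title` with a `top_hits` sub-aggregation, followed by a reshaping step. The aggregation names `titles` and `group`, both helper functions, and the sample response are assumptions made for illustration, and `title` is assumed to be indexed as a single non-analyzed term.

```python
def build_group_query(max_titles=10, docs_per_title=10):
    """Build a request body that buckets by title and keeps the top hits per bucket."""
    return {
        "size": 0,  # only the aggregation result is needed, not the plain hits
        "aggs": {
            "titles": {
                "terms": {"field": "title", "size": max_titles},
                "aggs": {"group": {"top_hits": {"size": docs_per_title}}},
            }
        },
    }


def to_expected_shape(response):
    """Turn the aggregation response into a list of {title, group} entries."""
    result = []
    for bucket in response["aggregations"]["titles"]["buckets"]:
        members = [
            {"id": hit["_source"]["id"], "store": hit["_source"]["store"]}
            for hit in bucket["group"]["hits"]["hits"]
        ]
        result.append({"title": bucket["key"], "group": members})
    return result


# Hand-written response in the assumed shape, trimmed to the fields used here.
sample_response = {"aggregations": {"titles": {"buckets": [
    {"key": "Title 1", "doc_count": 3, "group": {"hits": {"hits": [
        {"_source": {"id": 1, "title": "Title 1", "store": "store1"}},
        {"_source": {"id": 2, "title": "Title 1", "store": "store2"}},
        {"_source": {"id": 3, "title": "Title 1", "store": "store3"}},
    ]}}},
    {"key": "Title 2", "doc_count": 2, "group": {"hits": {"hits": [
        {"_source": {"id": 4, "title": "Title 2", "store": "store2"}},
        {"_source": {"id": 5, "title": "Title 2", "store": "store3"}},
    ]}}},
]}}}

query = build_group_query()
grouped = to_expected_shape(sample_response)
print(grouped)
```

The `_source` filter from the answer's example could be added inside `top_hits` to return only `id` and `store`.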

Answer 3

Similar to SQL's GROUP BY, Elasticsearch provides aggregations.

With an aggregation query, Elasticsearch responds with buckets.

One bucket corresponds to one category (group).
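To make the analogy concrete, the sketch below runs an actual SQL GROUP BY over the question's data with Python's built-in `sqlite3` module and reshapes each row into a bucket-like entry. The `books` table and the `key` / `doc_count` naming are only there to mirror the way a terms aggregation reports its buckets; nothing here talks to Elasticsearch.

```python
import sqlite3

# In-memory table holding the five books from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER, title TEXT, store TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        (1, "Title 1", "store1"),
        (2, "Title 1", "store2"),
        (3, "Title 1", "store3"),
        (4, "Title 2", "store2"),
        (5, "Title 2", "store3"),
    ],
)

# One row per title, largest group first.
rows = conn.execute(
    "SELECT title, COUNT(*) AS doc_count FROM books "
    "GROUP BY title ORDER BY doc_count DESC"
).fetchall()
conn.close()

# Each GROUP BY row plays the role of one bucket.
buckets = [{"key": title, "doc_count": count} for title, count in rows]
print(buckets)
```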

Answer 4

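I had the same problem, and the best solution I found was to change the mapping. You can change the mapping so that the "store" field is a nested type, because your relationship is many-to-many. That way you can apply sorting and pagination. Hope this helps.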
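One possible reading of this suggestion is to index a single document per title and keep its stores in a nested field. The Python sketch below shows that restructuring step and a mapping to go with it. The field name `stores`, the helper `to_nested_documents`, and the mapping itself are assumptions for illustration, not something the answer spells out; the mapping uses the `keyword` type, which assumes a newer Elasticsearch version than the 1.x discussed earlier.

```python
# Assumed mapping: one document per title, with stores as nested objects.
nested_mapping = {
    "properties": {
        "title": {"type": "keyword"},
        "stores": {
            "type": "nested",
            "properties": {
                "id": {"type": "integer"},
                "store": {"type": "keyword"},
            },
        },
    }
}


def to_nested_documents(flat_books):
    """Collapse flat book records into one document per title before indexing."""
    by_title = {}
    for book in flat_books:
        by_title.setdefault(book["title"], []).append(
            {"id": book["id"], "store": book["store"]}
        )
    return [{"title": title, "stores": stores} for title, stores in by_title.items()]


flat_books = [
    {"id": 1, "title": "Title 1", "store": "store1"},
    {"id": 2, "title": "Title 1", "store": "store2"},
    {"id": 3, "title": "Title 1", "store": "store3"},
    {"id": 4, "title": "Title 2", "store": "store2"},
    {"id": 5, "title": "Title 2", "store": "store3"},
]

nested_docs = to_nested_documents(flat_books)
print(nested_docs)
```

With one document per title, ordinary search hits are already grouped, so the regular sort and from/size pagination apply to the groups directly.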

How to group results in Elasticsearch?
