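I am using Elasticsearch and need to implement faceted search over hierarchical objects, like this:
So I need facets over two related fields. The documentation says such facets are possible for numeric values, but I need them for strings: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-stats-facet.html
Here is another interesting thread, but unfortunately it is out of date: http://elasticsearch-users.115913.n3.nabble.com/Pivot-facets-td2981519.html
Is this possible with Elasticsearch? If so, how do I do it?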
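Answer 0 (score: 5)
The previous solution works well as long as each document carries at most one multi-level tag. Once a document has several, a simple aggregation no longer works, because the flat structure of Lucene fields mixes the values in the inner aggregation. See the following example: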
DELETE /test_category
POST /test_category
# Insert a doc with 2 hierarchical tags
POST /test_category/test/1
{
"categories": [
{
"cat_1": "1",
"cat_2": "1.1"
},
{
"cat_1": "2",
"cat_2": "2.2"
}
]
}
# Simple two-levels aggregations query
GET /test_category/test/_search?search_type=count
{
"aggs": {
"main_category": {
"terms": {
"field": "categories.cat_1"
},
"aggs": {
"sub_category": {
"terms": {
"field": "categories.cat_2"
}
}
}
}
}
}
This is the wrong response I get on ES 1.4, where the values of the inner aggregation's field are mixed at the document level:
{
...
"aggregations": {
"main_category": {
"buckets": [
{
"key": "1",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1",
"doc_count": 1
},
{
"key": "2.2", <= WRONG
"doc_count": 1
}
]
}
},
{
"key": "2",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1", <= WRONG
"doc_count": 1
},
{
"key": "2.2",
"doc_count": 1
}
]
}
}
]
}
}
}
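To see why the sub-categories leak across buckets, here is a minimal Python sketch of the flattening described above. It is a toy model, not Elasticsearch code: flatten and two_level_terms are hypothetical helpers that only mimic how an array of plain objects ends up as one multi-valued field per leaf path.

```python
from collections import Counter, defaultdict

# The document from the example above, with two hierarchical tags.
doc = {
    "categories": [
        {"cat_1": "1", "cat_2": "1.1"},
        {"cat_1": "2", "cat_2": "2.2"},
    ]
}

def flatten(document):
    """Mimic flat indexing: one multi-valued field per leaf path (hypothetical helper)."""
    flat = defaultdict(list)
    for obj in document["categories"]:
        for key, value in obj.items():
            flat["categories." + key].append(value)
    return dict(flat)

def two_level_terms(flat_docs, outer, inner):
    """Toy terms/sub-terms aggregation over flattened documents."""
    result = defaultdict(Counter)
    for flat in flat_docs:
        # Every outer value sees every inner value of the same document,
        # because the pairing between cat_1 and cat_2 was lost when flattening.
        for outer_value in set(flat.get(outer, [])):
            for inner_value in set(flat.get(inner, [])):
                result[outer_value][inner_value] += 1
    return {key: dict(counts) for key, counts in result.items()}

flat_doc = flatten(doc)
mixed = two_level_terms([flat_doc], "categories.cat_1", "categories.cat_2")
print(flat_doc)
print(mixed)
```

In this model both main categories end up with both sub-categories, which is the same mix-up as in the response above.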
A solution is to use nested objects. These are the steps:
1) Define a new type whose mapping declares categories as nested objects
POST /test_category/test2/_mapping
{
"test2": {
"properties": {
"categories": {
"type": "nested",
"properties": {
"cat_1": {
"type": "string"
},
"cat_2": {
"type": "string"
}
}
}
}
}
}
# Insert a single document
POST /test_category/test2/1
{"categories":[{"cat_1":"1","cat_2":"1.1"},{"cat_1":"2","cat_2":"2.2"}]}
2) Run a nested aggregation query:
GET /test_category/test2/_search?search_type=count
{
"aggs": {
"categories": {
"nested": {
"path": "categories"
},
"aggs": {
"main_category": {
"terms": {
"field": "categories.cat_1"
},
"aggs": {
"sub_category": {
"terms": {
"field": "categories.cat_2"
}
}
}
}
}
}
}
}
This is the response I get, which is now correct:
{
...
"aggregations": {
"categories": {
"doc_count": 2,
"main_category": {
"buckets": [
{
"key": "1",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1",
"doc_count": 1
}
]
}
},
{
"key": "2",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "2.2",
"doc_count": 1
}
]
}
}
]
}
}
}
}
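As a rough illustration of why the nested mapping gives the right answer, the following Python sketch treats every object under categories as its own unit, which is the behaviour nested objects are meant to provide. It is a toy model with a hypothetical helper name, not how Elasticsearch is implemented.

```python
from collections import Counter, defaultdict

doc = {
    "categories": [
        {"cat_1": "1", "cat_2": "1.1"},
        {"cat_1": "2", "cat_2": "2.2"},
    ]
}

def nested_two_level_terms(docs, path, outer, inner):
    """Toy nested aggregation: each object under `path` is counted on its own."""
    result = defaultdict(Counter)
    nested_count = 0
    for document in docs:
        for obj in document.get(path, []):
            nested_count += 1
            # cat_1 and cat_2 are read from the same object, so they stay paired.
            if outer in obj and inner in obj:
                result[obj[outer]][obj[inner]] += 1
    return nested_count, {key: dict(counts) for key, counts in result.items()}

count, separated = nested_two_level_terms([doc], "categories", "cat_1", "cat_2")
print(count, separated)
```

The count of 2 corresponds to the doc_count of the categories aggregation in the response above: one parent document, two nested objects.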
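The same solution can be extended to hierarchies with more than two levels.
Answer 1 (score: 3)
Currently, Elasticsearch does not support hierarchical facets out of the box. But the upcoming 1.0 release features a new aggregations module that can be used to get this kind of facet (more like pivot facets than hierarchical facets). Version 1.0 is currently in beta; you can download the second beta and test aggregations yourself. Your example might look like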
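For deeper hierarchies the request body can be generated instead of written by hand. The sketch below is one possible way to do that in Python. Note the assumptions: build_nested_terms_aggs is a hypothetical helper, cat_3 is an invented third level that would also have to exist in the mapping, and each aggregation is simply named after its field rather than main_category / sub_category.

```python
import json

def build_nested_terms_aggs(path, fields):
    """Build a nested aggregation body with one terms level per entry in `fields`."""
    if not fields:
        raise ValueError("at least one field is required")
    inner = None
    # Start at the deepest level and wrap outwards, so each level contains the next.
    for field in reversed(fields):
        level = {"terms": {"field": f"{path}.{field}"}}
        if inner is not None:
            level["aggs"] = inner
        inner = {field: level}
    return {"aggs": {path: {"nested": {"path": path}, "aggs": inner}}}

# cat_3 is hypothetical and only illustrates a third level.
body = build_nested_terms_aggs("categories", ["cat_1", "cat_2", "cat_3"])
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the two-level query above and can be sent to the same _search endpoint.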
curl -XPOST 'localhost:9200/_search?pretty' -d '
{
"aggregations": {
"main category": {
"terms": {
"field": "cat_1",
"order": {"_term": "asc"}
},
"aggregations": {
"sub category": {
"terms": {
"field": "cat_2",
"order": {"_term": "asc"}
}
}
}
}
}
}'
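The idea is to have a different field for each level of the facet and to bucket your documents by the terms of the first level (cat_1). Each of these buckets then gets sub-buckets based on the terms of the second level (cat_2). The result might look like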
{
"aggregations" : {
"main category" : {
"buckets" : [ {
"key" : "category 1",
"doc_count" : 10,
"sub category" : {
"buckets" : [ {
"key" : "subcategory 1",
"doc_count" : 4
}, {
"key" : "subcategory 2",
"doc_count" : 6
} ]
}
}, {
"key" : "category 2",
"doc_count" : 7,
"sub category" : {
"buckets" : [ {
"key" : "subcategory 1",
"doc_count" : 3
}, {
"key" : "subcategory 2",
"doc_count" : 4
} ]
}
} ]
}
}
}
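On the client side, a response of this shape can be turned into a flat list of facet paths. The following Python sketch assumes the response has already been parsed into a dict; facet_rows is a hypothetical helper, and the sample data is copied from the illustrative result above.

```python
def facet_rows(buckets, child_names, prefix=()):
    """Yield (path, doc_count) pairs from terms buckets and their sub-buckets."""
    for bucket in buckets:
        path = prefix + (bucket["key"],)
        yield path, bucket["doc_count"]
        if child_names:
            # Descend into the sub-aggregation named by the next entry.
            child = bucket.get(child_names[0], {})
            yield from facet_rows(child.get("buckets", []), child_names[1:], path)

response = {
    "aggregations": {
        "main category": {
            "buckets": [
                {"key": "category 1", "doc_count": 10, "sub category": {"buckets": [
                    {"key": "subcategory 1", "doc_count": 4},
                    {"key": "subcategory 2", "doc_count": 6}]}},
                {"key": "category 2", "doc_count": 7, "sub category": {"buckets": [
                    {"key": "subcategory 1", "doc_count": 3},
                    {"key": "subcategory 2", "doc_count": 4}]}},
            ]
        }
    }
}

top = response["aggregations"]["main category"]["buckets"]
rows = list(facet_rows(top, ["sub category"]))
for path, doc_count in rows:
    print(" > ".join(path), doc_count)
```

Passing more names in child_names lets the same function walk deeper hierarchies.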