Choosing the right Elasticsearch aggregation for faceted search

Date: 2015-11-15 18:34:14

Tags: search elasticsearch facet faceted-search

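I want to offer faceted search for clothing products on a platform. Since I already have Elasticsearch-based search (simple queries on the product name only), it would be best to implement faceted search with ES as well.

This should be done with aggregations, because facets are deprecated and aggregations can also be nested.

However, I can't wrap my head around the many available aggregations and which of them fits my case: there are terms, filter, filters, nested, and so on. All of them seem suitable.

What I want to achieve sounds pretty basic: I have different facets (brand, condition, color), each with different values. For some facets (brand), the user can select only one value. For others (color), the user may select up to 3 values (because some clothes have multiple colors).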
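To make these selection rules concrete, here is a minimal Python sketch that turns a user's choices into a filter. The field names brand and color come from the description above. The function name build_selection_filter and the constant MAX_COLORS are hypothetical. The sketch also assumes that the fields hold exact, single-token values (so that term and terms clauses match them) and that the filter clause of the bool query, available from Elasticsearch 2.0, can be used.

```python
from typing import Any, Dict, List, Optional

MAX_COLORS = 3  # upper limit on selected colors, as stated in the question


def build_selection_filter(brand: Optional[str], colors: List[str]) -> Dict[str, Any]:
    """Build a query from one optional brand and up to MAX_COLORS colors.

    Hypothetical helper: it only assembles a dict and sends nothing.
    """
    if len(colors) > MAX_COLORS:
        raise ValueError(f"at most {MAX_COLORS} colors may be selected")

    clauses: List[Dict[str, Any]] = []
    if brand is not None:
        # Single-select facet: exactly one value must match.
        clauses.append({"term": {"brand": brand}})
    if colors:
        # Multi-select facet: any of the chosen values may match.
        clauses.append({"terms": {"color": colors}})

    if not clauses:
        # Nothing selected yet, so do not restrict the result set.
        return {"match_all": {}}
    return {"bool": {"filter": clauses}}
```

The returned dict could be used as the "query" part of a search request, next to whatever aggregations produce the facet counts.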

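I started with a multi-field terms facet. The natural next step would be to convert it to a terms aggregation (for the reason above), but the terms aggregation does not support multiple fields.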

{
    "query" : {
        "match_all" : {  }
    },
    "facets" : {
        "groupByBrandAndCondition" : {
            "terms" : {
                "fields" : ["brand", "condition"],
                "size" : 10
            }
        }
    }
}
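One possible way to carry the facet request above over to aggregations is to ask for one terms aggregation per field instead of a single multi-field one. The Python sketch below only builds the request body. The helper name build_facet_request and the "<field>_agg" naming scheme are illustrative assumptions, not part of any Elasticsearch client. Note that the result differs from the multi-field facet: each field gets its own list of buckets rather than one merged list.

```python
import json
from typing import Any, Dict, List


def build_facet_request(fields: List[str], size: int = 10) -> Dict[str, Any]:
    """Return a search body with one sibling terms aggregation per field."""
    aggs = {
        f"{field}_agg": {"terms": {"field": field, "size": size}}
        for field in fields
    }
    # "size": 0 suppresses the hits so that only aggregation results come back.
    return {"size": 0, "query": {"match_all": {}}, "aggs": aggs}


body = build_facet_request(["brand", "condition"])
print(json.dumps(body, indent=2))
```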

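I am somehow missing a simple but crucial point about how to do parallel multi-level grouping. In terms of the UI, the user should be able to select something like the following:

  • Brand (10)
    • A (7)
    • B (3) [X]
  • Color (5)
    • Blue (3) [X]
    • Red (2) [X]

Read: A (7), Blue (3) and Red (2) are selected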

1 answer:

Answer 0 (score: 2)

I created a basic mapping like this

POST your_index/your_type/_mapping
{
  "your_type": {
    "properties": {
      "product": {
        "type": "string"
      },
      "brand": {
        "type": "string"
      },
      "color": {
        "type": "string"
      }
    }
  }
}

I inserted some documents like this

PUT your_index/your_type/111
{
  "product" : "jeans" ,"brand" : "lee", "color" : "blue"
}

PUT your_index/your_type/1111
{
  "product" : "shoes" ,"brand" : "levi", "color" : "black"
}

And so on

A simple aggregation query like this

GET your_index/_search
{
  "size": 0,
  "aggs": {
    "prod_agg": {
      "terms": {
        "field": "product"
      },
      "aggs": {
        "brand_agg": {
          "terms": {
            "field": "brand"
          },
          "aggs": {
            "color_agg": {
              "terms": {
                "field": "color"
              }
            }
          }
        }
      }
    }
  }
}

will return

"aggregations": {
      "prod_agg": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "shoes",
               "doc_count": 4,
               "brand_agg": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "nike",
                        "doc_count": 3,
                        "color_agg": {
                           "doc_count_error_upper_bound": 0,
                           "sum_other_doc_count": 0,
                           "buckets": [
                              {
                                 "key": "blue",
                                 "doc_count": 2
                              },
                              {
                                 "key": "black",
                                 "doc_count": 1
                              }
                           ]
                        }
                     },
                     {
                        "key": "levi",
                        "doc_count": 1,
                        "color_agg": {
                           "doc_count_error_upper_bound": 0,
                           "sum_other_doc_count": 0,
                           "buckets": [
                              {
                                 "key": "black",
                                 "doc_count": 1
                              }
                           ]
                        }
                     }
                  ]
               }
            },
            {
               "key": "jeans",
               "doc_count": 3,
               "brand_agg": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "lee",
                        "doc_count": 2,
                        "color_agg": {
                           "doc_count_error_upper_bound": 0,
                           "sum_other_doc_count": 0,
                           "buckets": [
                              {
                                 "key": "black",
                                 "doc_count": 1
                              },
                              {
                                 "key": "blue",
                                 "doc_count": 1
                              }
                           ]
                        }
                     },
                     {
                        "key": "levi",
                        "doc_count": 1,
                        "color_agg": {
                           "doc_count_error_upper_bound": 0,
                           "sum_other_doc_count": 0,
                           "buckets": [
                              {
                                 "key": "black",
                                 "doc_count": 1
                              }
                           ]
                        }
                     }
                  ]
               }
            }
         ]
      }
   }

This can be used to populate the search criteria in the UI.
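As a rough illustration of how the nested response above might be consumed, the following Python sketch walks the buckets level by level and yields one row per innermost bucket. The function name flatten_buckets is hypothetical, and the aggregation names are assumed to be the ones used in the query (prod_agg, brand_agg, color_agg).

```python
from typing import Any, Dict, Iterator, Sequence, Tuple


def flatten_buckets(
    aggregations: Dict[str, Any], agg_names: Sequence[str]
) -> Iterator[Tuple[Tuple[str, ...], int]]:
    """Yield (key path, doc_count) for the innermost nested terms buckets."""

    def walk(node: Dict[str, Any], names: Sequence[str], path: Tuple[str, ...]):
        name, rest = names[0], names[1:]
        for bucket in node[name]["buckets"]:
            new_path = path + (bucket["key"],)
            if rest:
                # Descend into the sub-aggregation stored inside this bucket.
                yield from walk(bucket, rest, new_path)
            else:
                yield new_path, bucket["doc_count"]

    yield from walk(aggregations, list(agg_names), ())


# Trimmed-down copy of the response shown above.
sample = {
    "prod_agg": {"buckets": [{
        "key": "shoes", "doc_count": 4,
        "brand_agg": {"buckets": [{
            "key": "nike", "doc_count": 3,
            "color_agg": {"buckets": [
                {"key": "blue", "doc_count": 2},
                {"key": "black", "doc_count": 1},
            ]},
        }]},
    }]}
}

for path, count in flatten_buckets(sample, ["prod_agg", "brand_agg", "color_agg"]):
    print(" > ".join(path), count)
```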

Then, if the user wants to search for shoes, you can query

GET your_index/_search
{
  "size": 0,
  "query": {
    "match": {
      "product": "shoes"
    }
  }, 
  "aggs": {
    "brand_agg": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "color_agg": {
          "terms": {
            "field": "color"
          }
        }
      }
    }
  }
}

which will give you

"aggregations": {
      "brand_agg": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "nike",
               "doc_count": 3,
               "color_agg": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "blue",
                        "doc_count": 2
                     },
                     {
                        "key": "black",
                        "doc_count": 1
                     }
                  ]
               }
            },
            {
               "key": "levi",
               "doc_count": 1,
               "color_agg": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "black",
                        "doc_count": 1
                     }
                  ]
               }
            }
         ]
      }
   }

Or you can request them as separate, sibling aggregations alongside the query, like this

GET your_index/_search
{
  "size": 0,
  "query": {
    "match": {
      "product": "shoes"
    }
  },
  "aggs": {
    "brand_agg": {
      "terms": {
        "field": "brand"
      }
    },
    "color_agg" : {
      "terms": {
        "field": "color"
      }
    }
  }
}

which will give you

"aggregations": {
      "color_agg": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "black",
               "doc_count": 2
            },
            {
               "key": "blue",
               "doc_count": 2
            }
         ]
      },
      "brand_agg": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "nike",
               "doc_count": 3
            },
            {
               "key": "levi",
               "doc_count": 1
            }
         ]
      }
   }

Use the doc_count values to tell the user how many options they have.
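A small Python sketch of that last step, assuming the response shape of the sibling aggregations shown above: it turns every bucket into a "key (doc_count)" label that a UI could display. The function name facet_labels is made up for this example.

```python
from typing import Any, Dict, List


def facet_labels(aggregations: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each aggregation name to display labels such as 'nike (3)'."""
    return {
        name: [f'{bucket["key"]} ({bucket["doc_count"]})' for bucket in agg["buckets"]]
        for name, agg in aggregations.items()
    }


# Bucket data copied from the response above.
response_aggs = {
    "color_agg": {"buckets": [
        {"key": "black", "doc_count": 2},
        {"key": "blue", "doc_count": 2},
    ]},
    "brand_agg": {"buckets": [
        {"key": "nike", "doc_count": 3},
        {"key": "levi", "doc_count": 1},
    ]},
}

labels = facet_labels(response_aggs)
print(labels)
```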

Does this satisfy your requirements?