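Basically, what I'm trying to do here is get the second-level-down categories from a hierarchically stored string. The problem is that the depth of the hierarchy varies: one product category may have six levels and another only four. Otherwise I would have just implemented predefined levels.
I have some products like so: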
[
  {
    title: 'product one',
    categories: [
      'clothing/mens/shoes/boots/steel-toe'
    ]
  },
  {
    title: 'product two',
    categories: [
      'clothing/womens/tops/sweaters/open-neck'
    ]
  },
  {
    title: 'product three',
    categories: [
      'clothing/kids/shoes/sneakers/light-up'
    ]
  },
  {
    title: 'product etc.',
    categories: [
      'clothing/baby/bibs/super-hero'
    ]
  },
  ... more products
]
I am trying to get aggregation buckets like so:
buckets: [
  {
    key: 'clothing/mens',
    ...
  },
  {
    key: 'clothing/womens',
    ...
  },
  {
    key: 'clothing/kids',
    ...
  },
  {
    key: 'clothing/baby',
    ...
  },
]
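I've tried looking at filter prefixes and at include and exclude on terms, but I can't find anything that works. Please, someone point me in the right direction.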
Answer 0 (score: 2)
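Your category field should be analyzed with a custom analyzer. Maybe you have some other plans for category, so I'll just add a subfield that is used only for aggregations: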
{
  "settings": {
    "analysis": {
      "filter": {
        "category_trimming": {
          "type": "pattern_capture",
          "preserve_original": false,
          "patterns": [
            "(^\\w+\/\\w+)"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "filter": [
            "category_trimming",
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "category": {
          "type": "string",
          "fields": {
            "just_for_aggregations": {
              "type": "string",
              "analyzer": "my_analyzer"
            }
          }
        }
      }
    }
  }
}
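To see what the category_trimming pattern keeps from each path, here is a minimal Python sketch of the same regular expression. It only illustrates the capture group and is not how Elasticsearch runs the filter. The helper name trim_category is made up for this example, and falling back to the whole string when nothing matches is an assumption about the filter's behavior.

```python
import re

# The same pattern as in the category_trimming filter, as a Python regex:
# one or more word characters, a slash, then one or more word characters,
# anchored at the start of the string.
PATTERN = re.compile(r"(^\w+/\w+)")


def trim_category(path: str) -> str:
    """Return the captured first two levels of a category path, lowercased."""
    match = PATTERN.search(path)
    # Assumption: when the pattern does not match, keep the original string.
    trimmed = match.group(1) if match else path
    # Mirrors the "lowercase" filter that follows in my_analyzer.
    return trimmed.lower()


if __name__ == "__main__":
    for sample in (
        "clothing/womens/tops/sweaters/open-neck",
        "clothing/baby/bibs/super-hero",
    ):
        print(sample, "->", trim_category(sample))
```

One thing the sketch makes visible: \w does not match a hyphen, so a path whose first or second level contains one (a hypothetical home-goods/kitchen, say) would not come out as a clean two-level key with this pattern, and the pattern may need widening for such data.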
Test data:
POST /index/test/_bulk
{"index":{}}
{"category": "clothing/womens/tops/sweaters/open-neck"}
{"index":{}}
{"category": "clothing/mens/shoes/boots/steel-toe"}
{"index":{}}
{"category": "clothing/kids/shoes/sneakers/light-up"}
{"index":{}}
{"category": "clothing/baby/bibs/super-hero"}
The query itself:
GET /index/test/_search?search_type=count
{
  "aggs": {
    "by_category": {
      "terms": {
        "field": "category.just_for_aggregations",
        "size": 10
      }
    }
  }
}
The result:
"aggregations": {
  "by_category": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "clothing/baby",
        "doc_count": 1
      },
      {
        "key": "clothing/kids",
        "doc_count": 1
      },
      {
        "key": "clothing/mens",
        "doc_count": 1
      },
      {
        "key": "clothing/womens",
        "doc_count": 1
      }
    ]
  }
}
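As a sanity check outside Elasticsearch, the same buckets can be reproduced client-side from the four test documents. This is only a sketch for comparison: it splits on "/" rather than using the analyzer, the names second_level and bucket_counts are invented for the example, and the ordering (count descending, then key ascending) is an assumption chosen to match the output above.

```python
from collections import Counter

# The four category values from the bulk request above.
TEST_CATEGORIES = [
    "clothing/womens/tops/sweaters/open-neck",
    "clothing/mens/shoes/boots/steel-toe",
    "clothing/kids/shoes/sneakers/light-up",
    "clothing/baby/bibs/super-hero",
]


def second_level(path: str) -> str:
    """Keep only the first two path segments, lowercased."""
    return "/".join(path.split("/")[:2]).lower()


def bucket_counts(paths):
    """Count documents per two-level key, shaped like terms-aggregation buckets."""
    counts = Counter(second_level(p) for p in paths)
    # Assumed ordering: highest count first, ties broken by key.
    ordered = sorted(counts.items(), key=lambda kc: (-kc[1], kc[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


if __name__ == "__main__":
    for bucket in bucket_counts(TEST_CATEGORIES):
        print(bucket)
```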