I have the following document in an Elasticsearch index:
{
  "_index": "ecommerce",
  "_type": "products",
  "_id": "12895",
  "_score": 1,
  "_source": {
    "title": "Blue Armani Jeans",
    "slug": "blue-armani-jeans",
    "price": 200,
    "sale_price": 0,
    "vendor_id": 62,
    "featured": 0,
    "viewed": 0,
    "stock": 1,
    "sku": "arm-jeans",
    "brand": "",
    "rating": 0,
    "active": 0,
    "vendor_name": "Armani",
    "category": [
      "Men Fashion",
      "Casual Wear"
    ],
    "image": "armani-jeans.jpg",
    "variations": [
      {
        "variation_id": "32",
        "stock": 10,
        "price": 199,
        "variation_image": "",
        "sku": "arm-jeans-11",
        "Size": "38",
        "Color": "Blue"
      },
      {
        "variation_id": "33",
        "stock": 10,
        "price": 199,
        "variation_image": "",
        "sku": "arm-jeans-12",
        "Size": "40",
        "Color": "Blue"
      }
    ]
  }
},
I am using a query that uses aggregations to list all the variation values available as filters.
Query:
{
  "size": 0,
  "aggs": {
    "variations": {
      "nested": {
        "path": "variations"
      },
      "aggs": {
        "size": {
          "terms": {
            "field": "variations.Size"
          }
        },
        "color": {
          "terms": {
            "field": "variations.Color"
          }
        },
        "brand": {
          "reverse_nested": {},
          "aggs": {
            "brand": {
              "value_count": {
                "field": "brand"
              }
            }
          }
        }
      }
    }
  }
}
Output:
"color": {
  "doc_count_error_upper_bound": 0,
  "sum_other_doc_count": 543,
  "buckets": [
    {
      "key": "black",
      "doc_count": 298
    },
    {
      "key": "blue",
      "doc_count": 227
    },
    {
      "key": "brown",
      "doc_count": 170
    },
    {
      "key": "white",
      "doc_count": 153
    },
    {
      "key": "pink",
      "doc_count": 127
    },
    {
      "key": "grey",
      "doc_count": 120
    },
    {
      "key": "multi",
      "doc_count": 99
    },
    {
      "key": "red",
      "doc_count": 89
    },
    {
      "key": "color",
      "doc_count": 81
    },
    {
      "key": "green",
      "doc_count": 76
    }
  ]
},
"brand": {
  "doc_count": 621,
  "brand": {
    "value": 6
  }
},
"size": {
  "doc_count_error_upper_bound": 0,
  "sum_other_doc_count": 517,
  "buckets": [
    {
      "key": "size",
      "doc_count": 195
    },
    {
      "key": "s",
      "doc_count": 158
    },
    {
      "key": "free",
      "doc_count": 156
    },
    {
      "key": "m",
      "doc_count": 140
    },
    {
      "key": "l",
      "doc_count": 134
    },
    {
      "key": "xl",
      "doc_count": 102
    },
    {
      "key": "9",
      "doc_count": 69
    },
    {
      "key": "8",
      "doc_count": 68
    },
    {
      "key": "10",
      "doc_count": 67
    },
    {
      "key": "11",
      "doc_count": 61
    }
  ]
}
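The results are fine for values that contain no spaces, but a variation value such as "Free Size" gets split into "free" and "size".
How can I have such values treated as a single variation value? Or is there a dedicated query for this case?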
Answer 0 (score: 0)
The problem is that your mapping most likely looks like this:
...
"variations": {
  "properties": {
    "Size": {
      "type": "text",
      "analyzer": "standard"
...
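This is a bit of an oversimplification, but when Elasticsearch indexes documents, it first analyzes them: it splits the text into tokens and modifies those tokens to make them better suited for searching. It then stores an index of how many times each token appears in each document. For example, if you have a text that says "Dogs are awesome" and someone searches for "dog", you want that text to match, because it is about dogs. Elasticsearch is very powerful and can be used for all kinds of things, but its primary purpose is natural-language text search, so that is what it is set up for by default. If you want different behaviour, you need to tell it so explicitly (using the mapping).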
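To make the analysis step concrete, here is a rough Python sketch of what happens to a value like "Free Size". The `rough_standard_analyze` helper is hypothetical and only approximates the standard analyzer (the real one follows Unicode text segmentation rules), so treat it as an illustration of the idea rather than a reproduction of Elasticsearch's exact output.

```python
import re


def rough_standard_analyze(text):
    """Approximate the standard analyzer: split into word tokens, then lowercase.

    Assumption: a simplification for illustration only, not Elasticsearch's
    actual tokenizer.
    """
    return [token.lower() for token in re.findall(r"\w+", text)]


# One field value becomes two separate index terms.
print(rough_standard_analyze("Free Size"))
# Values without spaces stay a single (lowercased) term.
print(rough_standard_analyze("XL"))
```

This also suggests why the bucket keys in the question's output are all lowercase: the aggregation sees the analyzed tokens, not the original values.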
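When you run a terms aggregation, it would be very inefficient to go through the raw text of every document instead of simply using the index that has already been built, which conveniently contains the term counts per document. If your text is analyzed with the standard analyzer, the "terms" in this case are "free" and "size", not "Free Size". If you want the whole field value to be indexed as a single term, you can use the "keyword" type instead of the "text" type: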
...
"variations": {
  "properties": {
    "Size": {
      "type": "keyword"
...
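The difference between the two field types can be sketched in plain Python, without a cluster. The sample values and the helper names (`text_terms`, `keyword_terms`, `terms_buckets`) are made up for illustration, and the tokenizer is only a rough stand-in for the standard analyzer.

```python
import re
from collections import Counter

# Hypothetical sample: the Size value of five variations.
sizes = ["Free Size", "S", "Free Size", "M", "S"]


def text_terms(value):
    # Rough stand-in for an analyzed "text" field: word tokens, lowercased.
    return {token.lower() for token in re.findall(r"\w+", value)}


def keyword_terms(value):
    # A "keyword" field indexes the whole value as one unmodified term.
    return {value}


def terms_buckets(values, to_terms):
    # Count, for each term, how many values (documents) contain it.
    counts = Counter()
    for value in values:
        counts.update(to_terms(value))
    return dict(counts)


print(terms_buckets(sizes, text_terms))     # "free" and "size" are separate buckets
print(terms_buckets(sizes, keyword_terms))  # "Free Size" is a single bucket
```

Note that keyword values are kept verbatim, so "Blue" and "blue" would end up in different buckets unless the data is normalized before indexing.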
If you have not explicitly set any mapping for this field, the default in ES 5 actually already gives it a keyword-mapped sub-field:
...
"variations": {
  "properties": {
    "Size": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
...
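This means that, as long as none of your size values is longer than 256 characters, you can simply update your aggregation to look like this: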
...
"aggs": {
  "size": {
    "terms": {
      "field": "variations.Size.keyword"
    }
  },
...
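For completeness, here is a sketch of the full request body from the question with the sub-field swapped in, built as a Python dictionary. It assumes that "Color" received the same default "keyword" sub-field as "Size", which is worth checking against your actual mapping; the `build_variation_aggs` helper is hypothetical.

```python
import json


def build_variation_aggs(size_field="variations.Size.keyword",
                         color_field="variations.Color.keyword"):
    """Build the aggregation body from the question, using keyword sub-fields.

    Assumption: both sub-fields exist (default dynamic mapping in ES 5).
    """
    return {
        "size": 0,
        "aggs": {
            "variations": {
                "nested": {"path": "variations"},
                "aggs": {
                    "size": {"terms": {"field": size_field}},
                    "color": {"terms": {"field": color_field}},
                    "brand": {
                        "reverse_nested": {},
                        "aggs": {
                            "brand": {"value_count": {"field": "brand"}}
                        },
                    },
                },
            }
        },
    }


body = build_variation_aggs()
# Print the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```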
However, unless you are actually using the analyzed field, I would recommend reindexing with the Size field mapped as "keyword".
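A reindex along those lines could look roughly like the sketch below, which only builds and prints the two request bodies. The target index name `ecommerce_v2` is hypothetical, mapping "Color" as keyword as well is an assumption, and the "products" type level inside "mappings" applies to Elasticsearch versions before 7; all other fields are left to dynamic mapping here.

```python
import json

NEW_INDEX = "ecommerce_v2"  # hypothetical name for the new index

# Body for creating the new index with explicit keyword mappings.
create_index_body = {
    "mappings": {
        "products": {  # mapping type, as used by the document in the question
            "properties": {
                "variations": {
                    "type": "nested",  # required by the nested aggregation
                    "properties": {
                        "Size": {"type": "keyword"},
                        "Color": {"type": "keyword"},  # assumption: same fix for Color
                    },
                }
            }
        }
    }
}

# Body for the _reindex API: copy documents from the old index to the new one.
reindex_body = {
    "source": {"index": "ecommerce"},
    "dest": {"index": NEW_INDEX},
}

print("PUT /" + NEW_INDEX)
print(json.dumps(create_index_body, indent=2))
print("POST /_reindex")
print(json.dumps(reindex_body, indent=2))
```

After reindexing, the aggregation can target "variations.Size" directly, without the ".keyword" suffix.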