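I want to search several different fields independently, some of which are arrays of objects. I can't figure out how.
Given the following mapping and data entry, I want users to be able to search every available field in any combination. The user fills in a form with a keyword input, an exclude-keyword input, a date range, and multi-select dropdowns. What would this query look like? I've included several failed queries and filters below the data entry.
Mapping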
{
  "plants" : {
    "properties" : {
      "name" : {"type" : "string"},
      "description" : {"type" : "string"},
      "planting" : {"type" : "string"},
      "maintenance" : {"type" : "string"},
      "type" : {"type" : "integer"},
      "petals" : {
        "properties" : {
          "color" : {"type" : "string"}
        }
      },
      "species" : {
        "properties" : {
          "name" : {"type" : "string"},
          "subspecies" : {
            "properties" : {
              "name" : {"type" : "string"}
            }
          }
        }
      },
      "pests" : {
        "properties" : {
          "pest" : {"type" : "string"}
        }
      },
      "diseases" : {
        "properties" : {
          "disease" : {"type" : "string"}
        }
      }
    }
  }
}
Data entry: Rose
{
  "name" : "Rose",
  "description" : "A few paragraphs of text",
  "planting" : "A few paragraphs of text",
  "maintenance" : "A few paragraphs of text",
  "type" : "Perennial",
  "petals" : [
    {"color" : "Red"},
    {"color" : "White"},
    {"color" : "Yellow"},
    {"color" : "Pink"},
    {"color" : "Orange"},
    {"color" : "Purple"}
  ],
  "species" : [
    {
      "name" : "Hulthemia",
      "description" : "A few paragraphs of text",
      "subspecies" : []
    },
    {
      "name" : "Hesperrhodos",
      "description" : "A few paragraphs of text",
      "subspecies" : []
    },
    {
      "name" : "Platyrhodon",
      "description" : "A few paragraphs of text",
      "subspecies" : []
    },
    {
      "name" : "Rosa",
      "description" : "A few paragraphs of text",
      "subspecies" : [
        {"name" : "Banksianae"},
        {"name" : "Bracteatae"},
        {"name" : "Caninae"},
        {"name" : "Carolinae"},
        {"name" : "Chinensis"},
        {"name" : "Gallicanae"},
        {"name" : "Gymnocarpae"},
        {"name" : "Laevigatae"},
        {"name" : "Pimpinellifoliae"},
        {"name" : "Cinnamomeae"},
        {"name" : "Synstylae"}
      ]
    }
  ],
  "pests" : [],
  "diseases" : []
}
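Query
I've had some success with the following query, but it is not accurate on large data sets of 100k to 10M entries (the real data is not flowers and has many more fields). I'm searching multiple fields for multiple exact values while still wanting a relevance score for each entry. The "minimum_should_match" option stops making sense when I want flowers whose "petals.color" is "purple", "pink", and/or "white" and I also want to search one or two other fields, whether a list like "petals" or a string like "type". I could set "minimum_should_match" to 2, but a flower with several matching "petals.color" values would satisfy that requirement on its own, so I would get entries whose "type" is neither "Perennial" nor "Annual", for example "Biennial". I looked into filters next; that is my following example.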
{
  "query" : {
    "bool" : {
      "must" : [
        {
          "multi_match":{
            "query":"disease resistant",
            "type":"cross_fields",
            "fields":[
              "description",
              "planting",
              "maintenance",
              "name"
            ],
            "tie_breaker":0.3
          }
        }
      ],
      "must_not" : [
        {
          "multi_match":{
            "query":"lavender",
            "type":"cross_fields",
            "fields":[
              "description",
              "planting",
              "maintenance",
              "name"
            ],
            "tie_breaker":0.3
          }
        }
      ],
      "should" : [
        {"match" : {"type" : "Perennial"}},
        {"match" : {"type" : "Annual"}},
        {"match" : {"petals.color" : "purple"}},
        {"match" : {"petals.color" : "pink"}},
        {"match" : {"petals.color" : "white"}}
      ]
    }
  }
}
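To make the problem with a single flat "should" list concrete, here is a small Python sketch. It is a toy model, not Elasticsearch: it only counts how many clauses a document satisfies, which is the part of "minimum_should_match" that matters here. The two documents and the helper names are made up for illustration.

```python
# Toy model of a flat bool/should list. This is NOT Elasticsearch; it only
# counts satisfied clauses, which is what "minimum_should_match" compares.

def count_matches(doc, should_clauses):
    """Return how many (field, value) clauses the document satisfies."""
    count = 0
    for field, wanted in should_clauses:
        raw = doc.get(field, [])
        values = raw if isinstance(raw, list) else [raw]
        if wanted.lower() in {v.lower() for v in values}:
            count += 1
    return count


def matches_flat(doc, should_clauses, minimum_should_match):
    """True if enough clauses match, no matter which field they target."""
    return count_matches(doc, should_clauses) >= minimum_should_match


should_clauses = [
    ("type", "Perennial"),
    ("type", "Annual"),
    ("petals.color", "purple"),
    ("petals.color", "pink"),
    ("petals.color", "white"),
]

# Hypothetical documents, flattened to field -> value(s) for simplicity.
biennial = {"type": "Biennial", "petals.color": ["Purple", "Pink"]}
perennial_white = {"type": "Perennial", "petals.color": ["White"]}

print(matches_flat(biennial, should_clauses, 2))         # True (unwanted)
print(matches_flat(perennial_white, should_clauses, 2))  # True (wanted)
```

Both documents reach the threshold of 2, so the flat list cannot tell "one type plus one color" apart from "two colors and the wrong type".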
Query using terms
Here is an attempt using "terms". I'm not sure why it doesn't work.
{
  "query" : {
    "bool" : {
      "must" : [
        {
          "multi_match":{
            "query":"disease resistant",
            "type":"cross_fields",
            "fields":[
              "description",
              "planting",
              "maintenance",
              "name"
            ],
            "tie_breaker":0.3
          }
        },
        {
          "terms" : {
            "type" : ["Perennial","Annual"],
            "minimum_should_match" : 1
          }
        },
        {
          "terms" : {
            "petals.color" : ["purple","pink","white"],
            "minimum_should_match" : 1
          }
        }
      ],
      "must_not" : [
        {
          "multi_match":{
            "query":"lavender",
            "type":"cross_fields",
            "fields":[
              "description",
              "planting",
              "maintenance",
              "name"
            ],
            "tie_breaker":0.3
          }
        }
      ],
      "should" : [
      ]
    }
  }
}
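Query using query/filter
Here is an attempt at combining a query with a filter built from nested and/or filters. I suspect the problem lies in the "or" over "petals.color", because "petals.color" is a list of colors rather than a single exact value.
Another option is to list every combination of "petals.color" values to solve the "or" problem (i.e. purple+pink+white, purple+pink, purple+white, pink+white, purple, pink, white). This becomes exhaustive for lists that can hold hundreds of possible values when you are searching for a subset of them, for example a list of countries where you want to match the countries of one continent.
Another option is to invert the "petals.color" selection and put it into the "bool" "must_not". This is less work than the combination list, since Elasticsearch supports aggregations.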
{
  "query" : {
    "filtered" : {
      "query" : {
        "bool" : {
          "must" : [
            {
              "multi_match":{
                "query":"disease resistant",
                "type":"cross_fields",
                "fields":[
                  "description",
                  "planting",
                  "maintenance",
                  "name"
                ],
                "tie_breaker":0.3
              }
            }
          ],
          "must_not" : [
            {
              "multi_match":{
                "query":"lavender",
                "type":"cross_fields",
                "fields":[
                  "description",
                  "planting",
                  "maintenance",
                  "name"
                ],
                "tie_breaker":0.3
              }
            }
          ],
          "should" : [
          ]
        }
      },
      "filter" : {
        "and" : [
          {
            "or" : [
              {"match" : {"type" : "Perennial"}},
              {"match" : {"type" : "Annual"}}
            ]
          },
          {
            "or" : [
              {"match" : {"petals.color" : "purple"}},
              {"match" : {"petals.color" : "pink"}},
              {"match" : {"petals.color" : "white"}}
            ]
          }
        ]
      }
    }
  }
}
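As a rough illustration of why the combination-list option mentioned above gets out of hand, the following Python sketch enumerates every non-empty combination of the selected colors. The helper name is hypothetical; the point is only that n selected values produce 2^n - 1 combinations.

```python
from itertools import combinations


def non_empty_subsets(values):
    """List every non-empty combination of values, largest first."""
    return [
        combo
        for size in range(len(values), 0, -1)
        for combo in combinations(values, size)
    ]


colors = ["purple", "pink", "white"]
subsets = non_empty_subsets(colors)

for combo in subsets:
    print("+".join(combo))

# Three colors give 2**3 - 1 = 7 combinations.
print(len(subsets))

# The count doubles with every extra value, e.g. 20 selected values:
print(2 ** 20 - 1)
```

With three colors the list is still manageable, but the count roughly doubles with each additional selected value, which is why this approach does not scale to dropdowns with many options.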
Answer 0 (score: 1)
Nesting [bool][must][bool][should] isolates "minimum_should_match" to only the list (array of objects) being searched. See the example below.
{
  "query" : {
    "bool" : {
      "must" : [
        {
          "multi_match":{
            "query":"disease resistant",
            "type":"cross_fields",
            "fields":[
              "description",
              "planting",
              "maintenance",
              "name"
            ],
            "tie_breaker":0.3
          }
        },
        {
          "bool" : {
            "should" : [
              {"match" : {"type" : "Perennial"}},
              {"match" : {"type" : "Annual"}}
            ],
            "minimum_should_match" : 1
          }
        },
        {
          "bool" : {
            "should" : [
              {"match" : {"petals.color" : "purple"}},
              {"match" : {"petals.color" : "pink"}},
              {"match" : {"petals.color" : "white"}}
            ],
            "minimum_should_match" : 1
          }
        }
      ],
      "must_not" : [
        {
          "multi_match":{
            "query":"lavender",
            "type":"cross_fields",
            "fields":[
              "description",
              "planting",
              "maintenance",
              "name"
            ],
            "tie_breaker":0.3
          }
        }
      ]
    }
  }
}
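Since the form lets users combine fields freely, the nested structure above can be generated from the form input. The following is a minimal Python sketch under a few assumptions: the function and parameter names are hypothetical, each multi-select dropdown is passed as a field name mapped to its selected values, and the date range is left out because the mapping shown has no date field. It only builds the request body as a dict and does not call Elasticsearch.

```python
import json

# Free-text fields searched by the keyword and exclude-keyword inputs.
TEXT_FIELDS = ["description", "planting", "maintenance", "name"]


def text_clause(keywords):
    """multi_match clause over the free-text fields."""
    return {
        "multi_match": {
            "query": keywords,
            "type": "cross_fields",
            "fields": TEXT_FIELDS,
            "tie_breaker": 0.3,
        }
    }


def any_of(field, values):
    """Nested bool/should: at least one of the values must match this field."""
    return {
        "bool": {
            "should": [{"match": {field: value}} for value in values],
            "minimum_should_match": 1,
        }
    }


def build_query(keywords=None, exclude=None, selections=None):
    """Assemble the request body from whichever form inputs were filled in."""
    must, must_not = [], []
    if keywords:
        must.append(text_clause(keywords))
    if exclude:
        must_not.append(text_clause(exclude))
    for field, values in (selections or {}).items():
        if values:  # skip dropdowns with nothing selected
            must.append(any_of(field, values))

    if not must and not must_not:
        return {"query": {"match_all": {}}}

    bool_query = {}
    if must:
        bool_query["must"] = must
    if must_not:
        bool_query["must_not"] = must_not
    return {"query": {"bool": bool_query}}


body = build_query(
    keywords="disease resistant",
    exclude="lavender",
    selections={
        "type": ["Perennial", "Annual"],
        "petals.color": ["purple", "pink", "white"],
    },
)
print(json.dumps(body, indent=2))
```

Because every dropdown gets its own nested "bool" with "minimum_should_match" set to 1, selecting several colors can no longer compensate for a non-matching "type".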