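Searching with multiple parameters across multiple fields in Elasticsearch

Time: 2020-04-14 18:21:14

Tags: elasticsearch

I want this course returned only if 'Grade' = 'G6' and Type = 'Open' both match within the SAME audience. Currently the course is returned even when G6 and OPEN are found in different audiences, which is not what I want. I need the query to be applied to each audience individually and to return data only when both conditions are true in the same audience.

Here is my JSON: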

{
"took": 1,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 71,
    "max_score": 3.3118114,
    "hits": [
        {
            "_index": "courses",
            "_type": "course",
            "_id": "LBTBWdzyRw-jgiiYssjv8A",
            "_score": 3.3118114,
            "_source": {
                "id": "LBTBWdzyRw-jgiiYssjv8A",
                "title": "1503 regression testing",
                "shortDescription": "asdf",
                "description": "asdf",
                "learningOutcomes": "",
                "modules": [],
                "learningProvider": {
                    "id": "ig2-zIY_QkSpMC4O0Lm0hw",
                    "name": null,
                    "termsAndConditions": [],
                    "cancellationPolicies": []
                },
                "audiences": [
                    {
                        "id": "VfDpsS_5SXi8iZubzTkUBQ",
                        "name": "comm",
                        "areasOfWork": [
                            "Communications"
                        ],
                        "departments": [],
                        "grades": [
                            "G6"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "OPEN",
                        "eventId": null
                    },
                    {
                        "id": "eZPPPqTqRdiDAE3xCPlJMQ",
                        "name": "analysis",
                        "areasOfWork": [
                            "Analysis"
                        ],
                        "departments": [],
                        "grades": [
                            "G6"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "REQUIRED",
                        "eventId": null
                    }
                ],
                "preparation": "",
                "owner": {
                    "scope": "LOCAL",
                    "organisationalUnit": "co",
                    "profession": 63,
                    "supplier": ""
                },
                "visibility": "PUBLIC",
                "status": "Published",
                "topicId": ""
            }
        }
    ]
}

}

My ES code:

BoolQueryBuilder boolQuery = boolQuery();

boolQuery.should(QueryBuilders.matchQuery("audiences.departments.keyword", department));
boolQuery.should(QueryBuilders.matchQuery("audiences.areasOfWork.keyword", areaOfWork));
boolQuery.should(QueryBuilders.matchQuery("audiences.interests.keyword", interest));

BoolQueryBuilder filterQuery = boolQuery();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));

Here is the index mapping:

{
  "media": {
    "aliases": {}
  },
  "courses": {
    "aliases": {}
  },
  "feedback": {
    "aliases": {}
  },
  "learning-providers": {
    "aliases": {}
  },
  "resources": {
    "aliases": {}
  },
  "courses-0.4.0": {
    "aliases": {}
  },
  ".security-6": {
    "aliases": {
      ".security": {}
    }
  },
  "payments": {
    "aliases": {}
  }
}

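1 answer:

Answer 0 (score: 1)

Since you want the query to apply in each audience and only return data if it is true in the same audience, you need to map the audiences field with the nested data type. Otherwise Elasticsearch stores it as an object type, which has no concept of inner objects: Elasticsearch flattens the object hierarchy into a simple list of field names and values. For more details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html

Taking a simplified version of your example, suppose this is your document: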

    "audiences": [
            {
                "id": "1",
                "field": "comm"
            },
           {
                "id": "2",
                "field": "arts"
           }
   ]

Elasticsearch flattens it into the following form:

{
   "audiences.id": ["1", "2"],
   "audiences.field": ["comm", "arts"]
}

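Now, if your search query says that an audience must have id: 1 and field: arts, the document above will also match, even though no single audience has both values.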
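The effect can be reproduced without Elasticsearch. The following is only an illustrative sketch in plain Java, not how Elasticsearch is implemented: the class and method names (`FlattenedVsNested`, `matchesFlattened`, `matchesNested`) are made up for this example. It pools the values per field the way the flattened form above does, and compares that with checking each array element on its own.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FlattenedVsNested {
    // Object-type behaviour (simplified): pool every value of each field across all elements.
    static Map<String, Set<String>> flatten(List<Map<String, String>> objects) {
        Map<String, Set<String>> flat = new HashMap<>();
        for (Map<String, String> obj : objects) {
            for (Map.Entry<String, String> e : obj.entrySet()) {
                flat.computeIfAbsent(e.getKey(), k -> new HashSet<>()).add(e.getValue());
            }
        }
        return flat;
    }

    // Each condition is checked against the pooled values, so two conditions
    // can be satisfied by two different elements.
    static boolean matchesFlattened(List<Map<String, String>> objects, Map<String, String> conditions) {
        Map<String, Set<String>> flat = flatten(objects);
        for (Map.Entry<String, String> c : conditions.entrySet()) {
            Set<String> values = flat.getOrDefault(c.getKey(), Collections.emptySet());
            if (!values.contains(c.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Nested-type behaviour (simplified): all conditions must hold inside one element.
    static boolean matchesNested(List<Map<String, String>> objects, Map<String, String> conditions) {
        for (Map<String, String> obj : objects) {
            if (obj.entrySet().containsAll(conditions.entrySet())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Map<String, String>> audiences = List.of(
                Map.of("id", "1", "field", "comm"),
                Map.of("id", "2", "field", "arts"));
        Map<String, String> query = Map.of("id", "1", "field", "arts");

        System.out.println("flattened: " + matchesFlattened(audiences, query));
        System.out.println("nested:    " + matchesNested(audiences, query));
    }
}
```

With the two audiences above, the pooled check accepts the combination id 1 plus field arts, while the per-element check rejects it.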

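To avoid this, such objects should be mapped as nested objects. Elasticsearch then stores each object separately instead of flattening them, so each object is searched independently.

The mapping for your document should be:

Mapping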

{
    "mappings": {
        "properties": {
            "shortDescription": {
                "type": "text"
            },
            "audiences": {
                "type": "nested"
            },
            "description": {
                "type": "text"
            },
            "modules": {
                "type": "text"
            },
            "preparation": {
                "type": "text"
            },
            "owner": {
                "properties": {
                    "scope": {
                        "type": "text"
                    },
                    "organisationalUnit": {
                        "type": "text"
                    },
                    "profession": {
                        "type": "text"
                    },
                    "supplier": {
                        "type": "text"
                    }
                }
            },
            "learningProvider": {
                "properties": {
                    "id": {
                        "type": "text"
                    },
                    "name": {
                        "type": "text"
                    },
                    "termsAndConditions": {
                        "type": "text"
                    },
                    "cancellationPolicies": {
                        "type": "text"
                    }
                }
            },
            "visibility": {
                "type": "text"
            },
            "status": {
                "type": "text"
            },
            "topicId": {
                "type": "text"
            }
        }
    }
}

Now, if we index this document:

Document

{
    "shortDescription": "asdf",
    "description": "asdf",
    "learningOutcomes": "",
    "modules": [],
    "learningProvider": {
        "id": "ig2-zIY_QkSpMC4O0Lm0hw",
        "name": null,
        "termsAndConditions": [],
        "cancellationPolicies": []
    },
    "audiences": [
        {
            "id": "VfDpsS_5SXi8iZubzTkUBQ",
            "name": "comm",
            "areasOfWork": [
                "Communications"
            ],
            "departments": [],
            "grades": [
                "G6"
            ],
            "interests": [],
            "requiredBy": null,
            "frequency": null,
            "type": "OPEN",
            "eventId": null
        },
        {
            "id": "eZPPPqTqRdiDAE3xCPlJMQ",
            "name": "analysis",
            "areasOfWork": [
                "Analysis"
            ],
            "departments": [],
            "grades": [
                "G7"
            ],
            "interests": [],
            "requiredBy": null,
            "frequency": null,
            "type": "REQUIRED",
            "eventId": null
        }
    ],
    "preparation": "",
    "owner": {
        "scope": "LOCAL",
        "organisationalUnit": "co",
        "profession": 63,
        "supplier": ""
    },
    "visibility": "PUBLIC",
    "status": "Published",
    "topicId": ""
}

If you search with a query like this:

Search query 1

{
"query": {
    "nested": {
        "path": "audiences",
        
        "query": {
            "bool": {
                "must": [
                    {
                        "match": {
                            "audiences.type.keyword": "OPEN"
                        }
                        
                    },
                     {
                        "match": {
                            "audiences.grades.keyword": "G6"
                        }
                        
                    }
                ]
            }
        }
       
    }
}

}

Result

"hits": [
        {
            "_index": "product",
            "_type": "_doc",
            "_id": "1",
            "_score": 0.9343092,
            "_source": {
                "shortDescription": "asdf",
                "description": "asdf",
                "learningOutcomes": "",
                "modules": [],
                "learningProvider": {
                    "id": "ig2-zIY_QkSpMC4O0Lm0hw",
                    "name": null,
                    "termsAndConditions": [],
                    "cancellationPolicies": []
                },
                "audiences": [
                    {
                        "id": "VfDpsS_5SXi8iZubzTkUBQ",
                        "name": "comm",
                        "areasOfWork": [
                            "Communications"
                        ],
                        "departments": [],
                        "grades": [
                            "G6"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "OPEN",
                        "eventId": null
                    },
                    {
                        "id": "eZPPPqTqRdiDAE3xCPlJMQ",
                        "name": "analysis",
                        "areasOfWork": [
                            "Analysis"
                        ],
                        "departments": [],
                        "grades": [
                            "G7"
                        ],
                        "interests": [],
                        "requiredBy": null,
                        "frequency": null,
                        "type": "REQUIRED",
                        "eventId": null
                    }
                ],
                "preparation": "",
                "owner": {
                    "scope": "LOCAL",
                    "organisationalUnit": "co",
                    "profession": 63,
                    "supplier": ""
                },
                "visibility": "PUBLIC",
                "status": "Published",
                "topicId": ""
            }
        }
    ]

But now, if your search query is:

Search query 2:

{
    "query": {
        "nested": {
            "path": "audiences",
            
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "audiences.type.keyword": "OPEN"
                            }
                            
                        },
                         {
                            "match": {
                                "audiences.grades.keyword": "G7"
                            }
                            
                        }
                    ]
                }
            }
           
        }
    }
}

Result:

"hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
}
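To make the two results concrete, here is a small plain-Java sketch that evaluates the same two conditions against the two audiences of the indexed document. It only models the matching rule and is not Elasticsearch code; the `Audience` class and the `sameAudience` and `anyAudience` helpers are hypothetical names introduced for illustration.

```java
import java.util.List;

public class AudienceMatch {
    // Minimal stand-in for one element of the "audiences" array.
    static final class Audience {
        final String type;
        final List<String> grades;

        Audience(String type, List<String> grades) {
            this.type = type;
            this.grades = grades;
        }
    }

    // Nested-style rule: type and grade must both match within one audience.
    static boolean sameAudience(List<Audience> audiences, String type, String grade) {
        return audiences.stream()
                .anyMatch(a -> a.type.equals(type) && a.grades.contains(grade));
    }

    // Object-style rule: each condition may be satisfied by a different audience.
    static boolean anyAudience(List<Audience> audiences, String type, String grade) {
        return audiences.stream().anyMatch(a -> a.type.equals(type))
                && audiences.stream().anyMatch(a -> a.grades.contains(grade));
    }

    public static void main(String[] args) {
        // The two audiences from the indexed document: OPEN/G6 and REQUIRED/G7.
        List<Audience> audiences = List.of(
                new Audience("OPEN", List.of("G6")),
                new Audience("REQUIRED", List.of("G7")));

        System.out.println("OPEN + G6, same audience: " + sameAudience(audiences, "OPEN", "G6"));
        System.out.println("OPEN + G7, same audience: " + sameAudience(audiences, "OPEN", "G7"));
        System.out.println("OPEN + G7, any audience:  " + anyAudience(audiences, "OPEN", "G7"));
    }
}
```

Under the per-audience rule, OPEN with G6 matches and OPEN with G7 does not, which mirrors the two results above. Under the any-audience rule, OPEN with G7 would match, which is the unwanted behaviour described in the question.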

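So, in short, you need to change the data type of the audiences field to nested in the mapping, and change your queries so that they search the nested type.

That is, instead of this code snippet: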

BoolQueryBuilder filterQuery = boolQuery();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
        

You should use this nested query:

// ScoreMode here is org.apache.lucene.search.join.ScoreMode.
// Both must clauses are evaluated against the same element of "audiences".
BoolQueryBuilder filterQuery = new BoolQueryBuilder();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
NestedQueryBuilder nested = new NestedQueryBuilder("audiences", filterQuery, ScoreMode.None);