I have an index in Elasticsearch whose documents contain a field that holds an array of values. For example:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "families",
"_type": "family",
"_id": "o8qxd2EB9CizMt-k15mv",
"_score": 1,
"_source": {
"names": [
"Jefferson Erickson",
"Bailee Miller",
"Ahmed Bray"
]
}
},
{
"_index": "families",
"_type": "family",
"_id": "osqxd2EB9CizMt-kfZlJ",
"_score": 1,
"_source": {
"names": [
"Nia Walsh",
"Jefferson Erickson",
"Darryl Stark"
]
}
},
{
"_index": "families",
"_type": "family",
"_id": "pMrEd2EB9CizMt-kq5m-",
"_score": 1,
"_source": {
"names": [
"lia shelton",
"joanna shaffer",
"mathias little"
]
}
}
]
}
}
Now I need a search query that finds documents by matching against an array of values, like this:
GET /families/_search
{
"query" : {
"bool" : {
"filter" : {
"bool" : {
"should" : [
{"match_phrase" : {"names" : ["ahmed bray", "nia walsh"]}}
]
}
}
}
}
}
It should return the 2 documents that contain those names, like this:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": [
{
"_index": "families",
"_type": "family",
"_id": "o8qxd2EB9CizMt-k15mv",
"_score": 0,
"_source": {
"names": [
"Jefferson Erickson",
"Bailee Miller",
"Ahmed Bray"
]
}
},
{
"_index": "families",
"_type": "family",
"_id": "osqxd2EB9CizMt-kfZlJ",
"_score": 0,
"_source": {
"names": [
"Nia Walsh",
"Jefferson Erickson",
"Darryl Stark"
]
}
}
]
}
}
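How can I write such a query? I tried the "terms" query, but "terms" only lets me search the array for single words, like this: {"terms": {"names": ["bray", "nia"]}}
But I need to use full names, like this: {"names": ["ahmed bray", "nia walsh"]}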
Answer 0 (score: 0)
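The "problem" you are running into comes from how Elasticsearch handles text fields. By default, every text field is tokenized with the Standard Tokenizer, which, as you can see in the documentation, splits text on word boundaries. A full name such as "Ahmed Bray" is therefore indexed as separate single-word terms, which is why "terms" only matches single words on this field.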
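To make the effect concrete, here is a minimal Python sketch of what happens to the names array at index time. It is only an approximation: the helper approx_standard_tokens is a hypothetical stand-in (a regex split plus lowercasing), not the real tokenizer, which follows Unicode text segmentation rules.

```python
import re

def approx_standard_tokens(text):
    """Rough stand-in for the default analysis of a text field:
    split on non-word characters, then lowercase each token.
    (Assumption: good enough for plain ASCII names like these.)"""
    return [token.lower() for token in re.findall(r"\w+", text)]

# The names array of the first document in the question.
names = ["Jefferson Erickson", "Bailee Miller", "Ahmed Bray"]

# Every array element contributes its tokens to the same field.
indexed_terms = {token for name in names for token in approx_standard_tokens(name)}

# A terms query compares each search value with the indexed terms as-is.
print("bray" in indexed_terms)        # True: a single word is an indexed term
print("ahmed bray" in indexed_terms)  # False: the full name was never indexed as one term
```

The tokens Elasticsearch actually produces can be checked with the _analyze API instead of this simulation.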
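One way to achieve this is to customize the default settings and mappings. All you need to do is add a multi field (entire-phrase in our case) that is analyzed differently, and search against it.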
First create the index with the following settings/mappings:
PUT /families
{
"settings": {
"analysis": {
"normalizer": {
"case_and_accent_insensitive": {
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"family": {
"properties": {
"names": {
"type": "text",
"fields": {
"entire-phrase": {
"type": "keyword",
"normalizer": "case_and_accent_insensitive"
}
}
}
}
}
}
}
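As a rough illustration of why the keyword sub-field behaves differently, the sketch below imitates the normalizer in Python. The helper approx_normalize is hypothetical, and the accent handling is an assumption: it only strips accents that Unicode can decompose, whereas the real asciifolding filter covers more characters.

```python
import unicodedata

def approx_normalize(value):
    """Approximate the case_and_accent_insensitive normalizer:
    keep the whole value as ONE term, strip accents, lowercase."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.lower()

# The names array of the second document in the question.
names = ["Nia Walsh", "Jefferson Erickson", "Darryl Stark"]

# A keyword field stores one term per array element, not one per word.
keyword_terms = {approx_normalize(name) for name in names}

print("nia walsh" in keyword_terms)  # True: the full name is a single term
print("nia" in keyword_terms)        # False: single words are not indexed here
```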
Then you can search the way you expect:
GET /families/_search
{
"query": {
"terms": {
"names.entire-phrase": [
"ahmed bray",
"nia walsh"
]
}
}
}
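Be warned that this search will not find any results by first name or last name alone; only the entire phrase matches. If you want both behaviors, you have to search the names and names.entire-phrase fields together.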
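One possible way to search both fields at once is a bool query with two should clauses. The sketch below only builds the request body in Python; the function name build_names_query and its parameters are hypothetical, and the body would still need to be sent to GET /families/_search with whatever client you use.

```python
import json

def build_names_query(full_names, partial_text):
    """Build a search body that matches either an entire name
    (keyword sub-field) or individual words (analyzed text field)."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Exact, normalized match on the whole name.
                    {"terms": {"names.entire-phrase": full_names}},
                    # Word-level match on the analyzed field.
                    {"match": {"names": partial_text}},
                ],
                # At least one of the two clauses has to match.
                "minimum_should_match": 1,
            }
        }
    }

body = build_names_query(["ahmed bray", "nia walsh"], "shelton")
print(json.dumps(body, indent=2))
```

With the sample data from the question, the terms clause is meant to catch the two documents containing the full names, while the match clause covers a search by a single word such as a last name.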