Searching an array of values

Time: 2018-02-08 23:39:18

Tags: elasticsearch elasticsearch-5

I have an index in Elasticsearch whose documents contain a field holding an array of values. For example:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 1,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 1,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "pMrEd2EB9CizMt-kq5m-",
        "_score": 1,
        "_source": {
        "names": [
            "lia shelton",
            "joanna shaffer",
            "mathias little"
        ]
        }
    }
    ]
}
}

Now I need a search query that finds documents by an array of values, like this:

GET /families/_search
{
"query" : {
    "bool" : {
    "filter" : {
        "bool" : {
        "should" : [
            {"match_phrase" : {"names" : ["ahmed bray", "nia walsh"]}}
        ]
        }
    }
    }
}
}

It should return the 2 documents that contain these names, like this:

{
"took": 0,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
},
"hits": {
    "total": 2,
    "max_score": 0,
    "hits": [
    {
        "_index": "families",
        "_type": "family",
        "_id": "o8qxd2EB9CizMt-k15mv",
        "_score": 0,
        "_source": {
        "names": [
            "Jefferson Erickson",
            "Bailee Miller",
            "Ahmed Bray"
        ]
        }
    },
    {
        "_index": "families",
        "_type": "family",
        "_id": "osqxd2EB9CizMt-kfZlJ",
        "_score": 0,
        "_source": {
        "names": [
            "Nia Walsh",
            "Jefferson Erickson",
            "Darryl Stark"
        ]
        }
    }
    ]
}
}

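How can I write such a query? I tried the "terms" query, but "terms" only lets me search for single words from the array, like this: {"terms": {"names": ["bray", "nia"]}}

But I need to search by full names, like this: {"names": ["ahmed bray", "nia walsh"]}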

1 Answer:

Answer 0: (score: 0)

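The "problem" you are running into comes from how Elasticsearch handles text fields. By default, every text field is tokenized with the Standard Tokenizer which, as you can see in the documentation, splits the text on word boundaries. A name such as "Ahmed Bray" is therefore indexed as the separate terms ahmed and bray, and a "terms" query, which does not analyze its input, can only match one of those single words.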
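To make the effect concrete, here is a minimal Python sketch of what happens to the names field. It does not call Elasticsearch; `standard_like_tokens` is a hypothetical, simplified stand-in for the default analysis (split on non-word characters, then lowercase), and `terms_query_matches` only imitates the exact-term lookup a "terms" query performs.

```python
import re


def standard_like_tokens(text):
    # Simplified stand-in for the default analysis (an assumption, not the
    # real tokenizer): split on non-word characters and lowercase each token.
    return [token.lower() for token in re.findall(r"\w+", text)]


def index_terms(values):
    # Every array element is analyzed; all resulting tokens are indexed.
    terms = set()
    for value in values:
        terms.update(standard_like_tokens(value))
    return terms


def terms_query_matches(indexed_terms, query_terms):
    # A "terms" query does not analyze its input: a document matches when
    # at least one query term is found verbatim among the indexed terms.
    return any(term in indexed_terms for term in query_terms)


names = ["Jefferson Erickson", "Bailee Miller", "Ahmed Bray"]
indexed = index_terms(names)

print(sorted(indexed))
# ['ahmed', 'bailee', 'bray', 'erickson', 'jefferson', 'miller']
print(terms_query_matches(indexed, ["bray", "nia"]))              # True
print(terms_query_matches(indexed, ["ahmed bray", "nia walsh"]))  # False
```

The full name never appears as a single indexed term, which is why the query with "ahmed bray" comes back empty.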

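One way to achieve this is to adjust the default settings and mappings. All you need to do is add a multi-field (entire-phrase in our case) that is analyzed differently and use it for searching.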

First, create the index with the following settings/mappings:

{
  "settings": {
    "analysis": {
      "normalizer": {
        "case_and_accent_insensitive": {
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "family": {
      "properties": {
        "names": {
          "type": "text",
          "fields": {
            "entire-phrase": {
              "type": "keyword",
              "normalizer": "case_and_accent_insensitive"
            }
          }
        }
      }
    }
  }
}
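As a rough illustration of what the entire-phrase sub-field stores, the following Python sketch imitates the normalizer on whole values. It is only an approximation under stated assumptions: `normalize_keyword` is a hypothetical helper that uses Unicode decomposition in place of the real asciifolding filter, and `matching_ids` stands in for an exact-term lookup; nothing here talks to Elasticsearch.

```python
import unicodedata


def normalize_keyword(value):
    # Approximation of the lowercase + asciifolding filters (an assumption):
    # decompose accented characters, drop the combining marks, lowercase.
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.lower()


def matching_ids(docs, query_terms):
    # A keyword field keeps each array element as one term, so a lookup
    # succeeds only when a whole normalized name equals a query term.
    wanted = set(query_terms)
    hits = []
    for doc_id, names in docs.items():
        indexed = {normalize_keyword(name) for name in names}
        if indexed & wanted:
            hits.append(doc_id)
    return hits


families = {
    "o8qxd2EB9CizMt-k15mv": ["Jefferson Erickson", "Bailee Miller", "Ahmed Bray"],
    "osqxd2EB9CizMt-kfZlJ": ["Nia Walsh", "Jefferson Erickson", "Darryl Stark"],
    "pMrEd2EB9CizMt-kq5m-": ["lia shelton", "joanna shaffer", "mathias little"],
}

print(matching_ids(families, ["ahmed bray", "nia walsh"]))
# ['o8qxd2EB9CizMt-k15mv', 'osqxd2EB9CizMt-kfZlJ']
print(matching_ids(families, ["bray"]))
# []
```

Because the stored terms are lowercased, the query values should be given in lowercase as well.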

Then you can run the search you want:

{
  "query": {
    "terms": {
      "names.entire-phrase": [
        "ahmed bray",
        "nia walsh"
      ]
    }
  }
}

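I have to warn you that this search will not find any results by first name or last name alone. Only the entire phrase is matched. If you want both, you have to search the names and names.entire-phrase fields at the same time.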
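One possible shape for such a combined search is sketched below in Python. It only builds the request body and prints it as JSON; `build_name_query` is a hypothetical helper, and pairing a "terms" clause on names.entire-phrase with a "match" clause on names inside a bool/should is my assumption about how to search both fields, so adjust it to your needs.

```python
import json


def build_name_query(full_names, partial_name):
    # Hypothetical helper: a document matches if it contains one of the
    # full names (exact, via the keyword sub-field) OR the partial name
    # (analyzed, via the original text field).
    return {
        "query": {
            "bool": {
                "should": [
                    # Exact whole-name lookup; values should be lowercase
                    # to line up with the normalized keyword terms.
                    {"terms": {"names.entire-phrase": list(full_names)}},
                    # Analyzed lookup, so a first or last name alone works.
                    {"match": {"names": partial_name}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


body = build_name_query(["ahmed bray", "nia walsh"], "shelton")
print(json.dumps(body, indent=2))
```

The printed body can be sent to GET /families/_search in the same way as the queries above.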