Question

I have an Elasticsearch index with the following mapping:

PUT /student_detail
{
    "mappings" : {
        "properties" : {
            "id" : { "type" : "long" },
            "name" : { "type" : "text" },
            "email" : { "type" : "text" },
            "age" : { "type" : "text" },
            "status" : { "type" : "text" },
            "tests":{ "type" : "nested" }
        }
    }
}

The stored data looks like this:

{
  "id": 123,
  "name": "Schwarb",
  "email": "abc@gmail.com",
  "status": "current",
  "age": 14,
  "tests": [
    {
      "test_id": 587,
      "test_score": 10
    },
    {
      "test_id": 588,
      "test_score": 6
    }
  ]
}

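I want to be able to query for students whose name is like "%warb%" and whose email is like "%gmail.com%", and for whom the test with id 587 has a score > 5, and so on. At a high level it would be something like the query below. I don't know what the actual query should be, so apologies for the messy query.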

GET developer_search/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "abc"
          }
        },
        {
          "nested": {
            "path": "tests",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "tests.test_id": IN [587]
                    }
                  },
                   {
                    "term": {
                      "tests.test_score": >= some value
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

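The query must be flexible, so that we can pass in dynamic test ids with their respective score filters, together with filters on the non-nested fields such as age, name and status.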

Answer 1

Something like this?

GET student_detail/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "name": {
              "value": "*warb*"
            }
          }
        },
        {
          "wildcard": {
            "email": {
              "value": "*gmail.com*"
            }
          }
        },
        {
          "nested": {
            "path": "tests",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "tests.test_id": 587
                    }
                  },
                  {
                    "range": {
                      "tests.test_score": {
                        "gte": 5
                      }
                    }
                  }
                ]
              }
            },
            "inner_hits": {}
          }
        }
      ]
    }
  }
}

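inner_hits is what you are looking for. It returns, for each matching student, the nested tests objects that satisfied the nested query.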
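Since the question asks for dynamic test ids and scores, here is a minimal Python sketch of how the request body above could be assembled from a list of (test_id, min_score) pairs. The function name build_student_query and its parameters are hypothetical, not part of any client library. It emits one nested clause per pair, so that the id and the score are checked against the same nested object, and gives each inner_hits block its own name because several nested clauses share the tests path.

```python
from typing import Any, Dict, List, Tuple


def build_student_query(
    name_like: str,
    email_like: str,
    test_filters: List[Tuple[int, int]],
) -> Dict[str, Any]:
    """Build a search body for student_detail (illustrative helper)."""
    must: List[Dict[str, Any]] = [
        {"wildcard": {"name": {"value": f"*{name_like}*"}}},
        {"wildcard": {"email": {"value": f"*{email_like}*"}}},
    ]
    # One nested clause per (test_id, min_score) pair: both conditions
    # must hold for the same object inside "tests".
    for test_id, min_score in test_filters:
        must.append({
            "nested": {
                "path": "tests",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"tests.test_id": test_id}},
                            {"range": {"tests.test_score": {"gte": min_score}}},
                        ]
                    }
                },
                # Distinct name per clause, since all clauses use path "tests".
                "inner_hits": {"name": f"test_{test_id}"},
            }
        })
    return {"query": {"bool": {"must": must}}}


body = build_student_query("warb", "gmail.com", [(587, 5), (588, 6)])
```

The resulting dict can be serialized with json.dumps and sent as the body of student_detail/_search by whatever HTTP or Elasticsearch client you use.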

Answer 2

You should use an Ngram Tokenizer. Wildcard search should be avoided for performance reasons, so I would not recommend it.

Change your mapping to the one below, in which I have defined a custom analyzer.

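The way Elasticsearch (or rather Lucene underneath) indexes a statement is that it first breaks the statement or paragraph into words, or tokens, and then indexes those tokens in the inverted index of that particular field. This process is called Analysis, and it applies only to the text datatype.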

So you get a document back only if the tokens you search for are present in the inverted index.
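To make that concrete, here is a toy inverted index in Python. It is only a sketch under simplifying assumptions: word_tokens is a rough stand-in for the standard analyzer (split on whitespace, lowercase), not Lucene's actual implementation, and both function names are made up for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def word_tokens(text: str) -> List[str]:
    # Rough stand-in for the standard analyzer: split on whitespace, lowercase.
    return [word.lower() for word in text.split()]


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    # Map every token to the set of document ids that contain it.
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in word_tokens(text):
            index[token].add(doc_id)
    return index


index = build_inverted_index({1: "Schwarb", 2: "Life is beautiful"})
print(index.get("schwarb", set()))  # document 1 is found
print(index.get("warb", set()))     # empty: "warb" was never indexed as a token
```

With whole-word tokens, a lookup for warb finds nothing even though document 1 contains it as a substring, which is the gap the n-gram approach is meant to close.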

By default, the standard analyzer is applied. What I have done is create my own analyzer that uses the Ngram Tokenizer, which produces many more tokens than just the plain words.

With the default analyzer, Life is beautiful is indexed as the tokens life, is, beautiful.

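With the ngram tokenizer defined below (min_gram 3, max_gram 4), however, the tokens for Life would be Lif, Life and ife. The case is preserved because this analyzer defines no lowercase filter.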
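If you want to see which tokens to expect, the following Python sketch approximates an ngram tokenizer with min_gram 3, max_gram 4 and token_chars letter and digit. It is my own approximation, not Elasticsearch code: ngram_tokens is a hypothetical name, and the regular expression only treats ASCII letters and digits as token characters.

```python
import re
from typing import List


def ngram_tokens(text: str, min_gram: int = 3, max_gram: int = 4) -> List[str]:
    tokens: List[str] = []
    # token_chars = letter, digit: any other character splits the text.
    for chunk in re.findall(r"[A-Za-z0-9]+", text):
        for start in range(len(chunk)):
            for size in range(min_gram, max_gram + 1):
                if start + size <= len(chunk):
                    tokens.append(chunk[start:start + size])
    return tokens


print(ngram_tokens("Life"))           # ['Lif', 'Life', 'ife']
print(ngram_tokens("Schwarb"))        # includes 'war' and 'warb'
print(ngram_tokens("abc@gmail.com"))  # the '@' and '.' split the text
```

Because war is one of the tokens produced for Schwarb, a search for war can find that document in the inverted index.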

Mapping:

PUT student_detail
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 4,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
    "mappings" : {
        "properties" : {
            "id" : { 
              "type" : "long" 
            },
            "name" : { 
              "type" : "text",
              "analyzer": "my_analyzer",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "email" : { 
              "type" : "text",
              "analyzer": "my_analyzer",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "age" : { 
              "type" : "text"
            },
            "status" : { 
              "type" : "text",
              "analyzer": "my_analyzer",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "tests":{ 
              "type" : "nested" 
            }
        }
    }
}

I am not sure why age is mapped as text. Change it to long or integer; I leave that to you.

Notice that in the above mapping I have created a sibling field of type keyword for name, email and status, as below:

"name":{ 
   "type":"text",
   "analyzer":"my_analyzer",
   "fields":{ 
      "keyword":{ 
         "type":"keyword"
      }
   }
}

Your query could now look like the one below. Note the values war and gmail in the match clauses: because the fields are indexed as n-grams, war would return even documents whose name is "Schwarb".

Query:

POST student_detail/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "war"
          }
        },
        {
          "match": {
            "email": "gmail"
          }
        },
        {
          "nested": {
            "path": "tests",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "tests.test_id": 587
                    }
                  },
                  {
                    "range": {
                      "tests.test_score": {
                        "gte": 5
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

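Note that for exact matches I would use Term Queries on the keyword sub-fields, while for ordinary searches, or the equivalent of LIKE in SQL, I would use simple Match Queries on the text fields, provided they use the Ngram Tokenizer.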

Also note that for >= and <= you need to use a Range Query.
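As a rough sketch of that rule of thumb, the Python helpers below pick the clause type for a non-nested field. The names exact, like and between are hypothetical, and the range on age assumes you have remapped age to a numeric type such as integer or long.

```python
from typing import Any, Dict, Optional


def exact(field: str, value: str) -> Dict[str, Any]:
    # Exact match: term query against the keyword sub-field.
    return {"term": {f"{field}.keyword": value}}


def like(field: str, fragment: str) -> Dict[str, Any]:
    # LIKE-style search: match query against the ngram-analyzed text field.
    return {"match": {field: fragment}}


def between(
    field: str, gte: Optional[int] = None, lte: Optional[int] = None
) -> Dict[str, Any]:
    # >= and <= comparisons: range query (assumes a numeric mapping).
    bounds: Dict[str, int] = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {"range": {field: bounds}}


filters = [exact("status", "current"), like("name", "war"), between("age", gte=10, lte=18)]
body = {"query": {"bool": {"must": filters}}}
```

These clauses go in the same bool.must array as the nested clause on tests, so nested and non-nested conditions can be combined freely.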

Response:

{
  "took" : 233,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 3.7260926,
    "hits" : [
      {
        "_index" : "student_detail",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 3.7260926,
        "_source" : {
          "id" : 123,
          "name" : "Schwarb",
          "email" : "abc@gmail.com",
          "status" : "current",
          "age" : 14,
          "tests" : [
            {
              "test_id" : 587,
              "test_score" : 10
            },
            {
              "test_id" : 588,
              "test_score" : 6
            }
          ]
        }
      }
    ]
  }
}

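Note that when I run the query, the response contains the document you mentioned in your question.

Please do read the links I have shared. Understanding the concepts is vital. Hope this helps!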

How to query an Elasticsearch index with nested and non-nested fields
