Case-insensitive search using wildcards in Elasticsearch

Date: 2018-05-30 10:26:27

Tags: elasticsearch

I have just started working with Elasticsearch. I have an index "new_index" with the following mapping:

"new_index" : {
    "aliases" : { },
    "mappings" : {
      "current" : {
        "properties" : {
          "did" : {
            "type" : "integer"
          },
          "fil_date" : {
            "type" : "double"
          },
          "file_nr" : {
            "type" : "double"
          },
          "id" : {
            "type" : "integer"
          },
          "mark_text" : {
            "type" : "text"
          },
          "mark_type_id" : {
            "type" : "text"
          },
          "markdescr" : {
            "type" : "text"
          },
          "markdescrtext" : {
            "type" : "text"
          },
          "niceclmain" : {
            "type" : "double"
          },
          "owname" : {
            "type" : "text"
          },
          "statusapplication" : {
            "type" : "text"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1527665866982",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "Py5uWzVTRYqcZuCLcwm-BQ",
        "version" : {
          "created" : "6020499"
        },
        "provided_name" : "new_index"
      }
    }
  }

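Now I want to search on the field "mark_text". I have two kinds of search:

1. If I search for "smart", the results should contain only "smart", matched case-insensitively.
2. It should search the way SQL LIKE "%smart%" does, also case-insensitively.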

I have a query for the second search case, but I would like to know whether there is a solution that works for both search cases.

Edit: the query I use for search case 1 is:

GET _search
{
  "query": {
    "bool": {
      "must" : [
        {
          "match": {
            "mark_text": "smart"
          }
        }  
      ]
    }
  }
}

The query for search case 2:

GET _search
{
  "query": {
    "bool": {
      "must" : [
        {
          "wildcard": {
            "mark_text": "*smart*"
          }
        }  
      ]
    }
  }
}

1 answer:

Answer 0 (score: 0)

I created a new index with the mapping and settings shown below:

{
  "new_index5" : {
    "aliases" : { },
    "mappings" : {
      "current" : {
        "properties" : {
          "did" : {
            "type" : "integer"
          },
          "fil_date" : {
            "type" : "double"
          },
          "file_nr" : {
            "type" : "double"
          },
          "filing_date" : {
            "type" : "double"
          },
          "id" : {
            "type" : "integer"
          },
          "mark_identification" : {
            "type" : "keyword",
            "normalizer" : "lowercase_normalizer"
          },
          "mark_text" : {
            "type" : "keyword",
            "normalizer" : "lowercase_normalizer"
          },
          "mark_type_id" : {
            "type" : "text"
          },
          "markdescr" : {
            "type" : "text"
          },
          "markdescrtext" : {
            "type" : "text"
          },
          "niceclmain" : {
            "type" : "double"
          },
          "owname" : {
            "type" : "keyword",
            "normalizer" : "lowercase_normalizer"
          },
          "party_name" : {
            "type" : "keyword",
            "normalizer" : "lowercase_normalizer"
          },
          "primary_code" : {
            "type" : "text"
          },
          "registration_date" : {
            "type" : "double"
          },
          "registration_number" : {
            "type" : "double"
          },
          "serial_number" : {
            "type" : "double"
          },
          "status_code" : {
            "type" : "text"
          },
          "statusapplication" : {
            "type" : "text"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "5",
        "provided_name" : "new_index5",
        "creation_date" : "1527686957833",
        "analysis" : {
          "normalizer" : {
            "lowercase_normalizer" : {
              "filter" : [
                "lowercase"
              ],
              "type" : "custom",
              "char_filter" : [ ]
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "9YdUrs1cSBuqDJmvSPOm6g",
        "version" : {
          "created" : "6020499"
        }
      }
    }
  }
} 
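To illustrate why this mapping makes exact matching case-insensitive, here is a minimal Python sketch of what a `lowercase` normalizer does on a keyword field: the whole value is lowercased before it is indexed, and the query text is lowercased the same way before lookup. This is a toy model, not Elasticsearch code; `normalize` and `TinyKeywordIndex` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Toy stand-in for lowercase_normalizer: lowercase the whole value."""
    return value.lower()


class TinyKeywordIndex:
    """Toy model of a keyword field: one indexed term per document value."""

    def __init__(self) -> None:
        self.postings = defaultdict(list)  # normalized term -> doc ids
        self.sources = {}                  # doc id -> original value (like _source)

    def index(self, doc_id: int, value: str) -> None:
        self.sources[doc_id] = value
        self.postings[normalize(value)].append(doc_id)

    def match(self, text: str) -> list:
        # The query text goes through the same normalization before lookup.
        doc_ids = self.postings.get(normalize(text), [])
        return [self.sources[d] for d in doc_ids]


idx = TinyKeywordIndex()
for i, v in enumerate(["smart", "SMART", "Smart Phone"]):
    idx.index(i, v)

print(idx.match("Smart"))  # ['smart', 'SMART']
```

Note that "Smart Phone" is not returned: a keyword field stores the whole value as a single term, so only values equal to "smart" after lowercasing match.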

For the first search case, I added an aggregation to my query, as shown below:

GET _search
{
  "query": {
    "bool": {
      "must" : [
        {
          "match": {
              "mark_text": "smart"
          }
        }
      ]
    }
  },
  "aggs": {
    "mark_texts": {
      "terms": {
        "field": "mark_text"
      }
    }
  }
}

It returns results that include both "smart" and "SMART".

For the second search case, I use a fuzzy query.
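The query for the second case is not shown in this answer. As an alternative sketch for the LIKE "%smart%" case, one option that fits the mapping above is a `wildcard` query whose pattern is lowercased on the client, since the normalizer stores every value in lowercase. The Python below only builds the request body; `build_contains_query` is a hypothetical helper name, and the sketch deliberately does not rely on the server normalizing wildcard patterns, because that behavior may vary between Elasticsearch versions.

```python
import json


def build_contains_query(field: str, text: str) -> dict:
    """Build a wildcard query body comparable to SQL LIKE '%text%'.

    The pattern is lowercased here because the field's normalizer
    stores every value in lowercase.
    """
    pattern = "*" + text.lower() + "*"
    return {"query": {"wildcard": {field: pattern}}}


body = build_contains_query("mark_text", "SMART")
print(json.dumps(body))
# {"query": {"wildcard": {"mark_text": "*smart*"}}}
```

The printed JSON can be sent as the body of a `GET _search` request. Patterns that begin with `*` can be slow on large indices, so this is best treated as a starting point rather than a tuned solution.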

I still do not know how the aggregation and the normalizer solved my problem, but I am trying to understand it.
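A hedged note on the likely explanation: the normalizer, not the aggregation, is what makes the match case-insensitive. A `terms` aggregation only summarizes the documents the query already matched. On a normalized keyword field its bucket keys are the normalized (lowercase) values, while the hits still return the original text from `_source`. The Python sketch below models only that bucketing; the sample values are made up for illustration.

```python
from collections import Counter

# Hypothetical original mark_text values of the documents the query matched.
matched_values = ["smart", "SMART", "Smart"]

# Bucket keys are built from the normalized value, so all three
# documents fall into a single lowercase bucket.
buckets = Counter(v.lower() for v in matched_values)

print(dict(buckets))  # {'smart': 3}
```

Under this model, removing the `aggs` section from the query would not change which documents are returned; it would only drop the bucket summary from the response.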