Elasticsearch: adding a normalizer for case-insensitive search on keyword fields

Posted: 2017-11-02 00:22:19

Tags: elasticsearch

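I have a large ES index with dynamically created "keyword" fields, and I need to run case-insensitive searches on them. I understand that analyzers do not apply to keyword fields and that a normalizer is used instead: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-normalizers.html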

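Is there a way to add a normalizer to a field/mapping dynamically? I can add an analyzer to an existing text field by closing the index, adding the analyzer, and reopening the index. The same approach does not seem to work for adding a normalizer to an existing index. Is there a way to do this other than creating another index and reindexing all the data?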
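For context, the close / update settings / reopen sequence mentioned above looks roughly like the sketch below. It is an illustration under assumptions rather than the exact commands used: the analyzer name `my_lowercase_analyzer` is a placeholder, and the `run` helper with its `DRY_RUN` switch is introduced here only so that the script prints the requests by default instead of sending them.

```shell
#!/bin/sh
# Sketch: add an analyzer to an existing index by closing it, updating the
# analysis settings, and reopening it.
# Assumptions (not from the original post): the analyzer name
# "my_lowercase_analyzer" and the DRY_RUN/run helper are illustrative.
ES="${ES:-localhost:9200}"
INDEX="ganesh_index"

# New analysis settings; these can only be added while the index is closed.
SETTINGS_BODY='{
  "analysis": {
    "analyzer": {
      "my_lowercase_analyzer": {
        "type": "custom",
        "tokenizer": "keyword",
        "filter": [ "lowercase" ]
      }
    }
  }
}'

# Print each command by default; set DRY_RUN=0 to actually send the requests.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
    echo
  fi
}

run curl -XPOST "$ES/$INDEX/_close"
run curl -XPUT "$ES/$INDEX/_settings" -H 'Content-Type: application/json' -d "$SETTINGS_BODY"
run curl -XPOST "$ES/$INDEX/_open"
```

This only registers the analyzer in the index settings; it does not change how any existing field is mapped.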

Here are my steps. First, create a test index with a lowercase normalizer:

curl -XPUT localhost:9200/ganesh_index/ -d '
{
  "settings": {
    "analysis": {
      "normalizer": {
        "useLowercase": {
          "type": "custom",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings":{
     "ganesh_type":{
        "properties":{
           "title":{
              "normalizer":"useLowercase",
              "type":"keyword"
           }
        }
     }
  }
}'

Now I can insert and query as needed:

curl -X PUT localhost:9200/ganesh_index/ganesh_type/1 -d '{"title":"ThisFox.StatusCode1"}'
curl -X PUT localhost:9200/ganesh_index/ganesh_type/2 -d '{"title":"ThisFox.StatusCode2"}'

curl -X POST 'localhost:9200/ganesh_index/_search?pretty' -d '{"query": {"regexp":{"title": "this.*code1"}}}'
{
  "took" : 24,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ganesh_index",
        "_type" : "ganesh_type",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "title" : "ThisFox.StatusCode1"
        }
      }
    ]
  }
}

However, suppose my index already exists, created like this:

curl -X PUT localhost:9200/ganesh_index -d '
{
  "settings": {
    "index": {
      "number_of_shards": 2,
      "number_of_replicas": 2
    }
  }
}'

After inserting records, I cannot add the normalizer afterwards:

curl -XPUT localhost:9200/ganesh_index/?pretty -d '
{
  "settings": {
    "analysis": {
      "normalizer": {
        "useLowercase": {
          "type": "custom",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings":{
     "ganesh_type":{
        "properties":{
           "title":{
              "normalizer":"useLowercase",
              "type":"keyword"
           }
        }
     }
  }
}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_already_exists_exception",
        "reason" : "index [ganesh_index/mg5TckzaR5KZDE-FphTeDg] already exists",
        "index_uuid" : "mg5TckzaR5KZDE-FphTeDg",
        "index" : "ganesh_index"
      }
    ],
    "type" : "index_already_exists_exception",
    "reason" : "index [ganesh_index/mg5TckzaR5KZDE-FphTeDg] already exists",
    "index_uuid" : "mg5TckzaR5KZDE-FphTeDg",
    "index" : "ganesh_index"
  },
  "status" : 400
}

Is there a way to add a normalizer (on a keyword field) to an existing index?

2 answers:

Answer 0 (score: 0)

No. You have to reindex or create a new mapping.
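A minimal sketch of the reindex route, assuming the same typed mapping as in the question and a hypothetical target index name `ganesh_index_v2`. The `run` helper and `DRY_RUN` switch are illustrative additions so that the script prints the requests by default; treat it as an outline to adapt, not a tested migration procedure.

```shell
#!/bin/sh
# Sketch: create a new index whose "title" keyword field uses a lowercase
# normalizer, then copy the existing documents into it with the _reindex API.
# Assumptions: "ganesh_index_v2" is a hypothetical name for the new index;
# the DRY_RUN/run helper is illustrative and not part of Elasticsearch.
ES="${ES:-localhost:9200}"
SRC="ganesh_index"
DEST="ganesh_index_v2"

# Same settings and mapping as in the question, applied to the new index.
CREATE_BODY='{
  "settings": {
    "analysis": {
      "normalizer": {
        "useLowercase": { "type": "custom", "filter": [ "lowercase" ] }
      }
    }
  },
  "mappings": {
    "ganesh_type": {
      "properties": {
        "title": { "type": "keyword", "normalizer": "useLowercase" }
      }
    }
  }
}'

# Copy every document from the old index into the new one.
REINDEX_BODY='{
  "source": { "index": "ganesh_index" },
  "dest":   { "index": "ganesh_index_v2" }
}'

# Print each command by default; set DRY_RUN=0 to actually send the requests.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
    echo
  fi
}

run curl -XPUT "$ES/$DEST" -H 'Content-Type: application/json' -d "$CREATE_BODY"
run curl -XPOST "$ES/_reindex" -H 'Content-Type: application/json' -d "$REINDEX_BODY"
```

Once the copy has finished and been verified, searches would be pointed at the new index, and the old one can be removed.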

Answer 1 (score: 0)

Elasticsearch does not currently support this. If you try it anyway, it returns an error message like this:

{
  "error": {
    "root_cause": [
      {
        "type": "resource_already_exists_exception",
        "reason": "index [category_video_autocomplete_3/FkxOwP_RQMW_L077hYLPJg] already exists",
        "index_uuid": "FkxOwP_RQMW_L077hYLPJg",
        "index": "category_video_autocomplete_3"
      }
    ],
    "type": "resource_already_exists_exception",
    "reason": "index [category_video_autocomplete_3/FkxOwP_RQMW_L077hYLPJg] already exists",
    "index_uuid": "FkxOwP_RQMW_L077hYLPJg",
    "index": "category_video_autocomplete_3"
  },
  "status": 400
}

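The message looks complicated, but on closer inspection you can see that

resource_already_exists_exception

means the resource you are trying to create already exists, so the same resource cannot be created again. Here the resource is the index named category_video_autocomplete_3. The error appears because PUT /<index> is the create-index request, which cannot be applied to an index that already exists.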
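One way to avoid running into this error is to check whether the index exists before sending a create request. The sketch below is an illustration under assumptions: it relies on a HEAD request to the index returning 200 when the index exists and 404 when it does not, and the `explain_status` helper and `DRY_RUN` switch are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch: check whether an index exists before trying to create it.
# Assumptions: explain_status and DRY_RUN are illustrative helpers,
# not part of Elasticsearch or curl.
ES="${ES:-localhost:9200}"
INDEX="${INDEX:-ganesh_index}"

# Map the HTTP status of HEAD /<index> to a short explanation.
explain_status() {
  case "$1" in
    200) echo "exists: PUT /$INDEX would fail with an already-exists error" ;;
    404) echo "missing: PUT /$INDEX can create it" ;;
    *)   echo "unexpected HTTP status: $1" ;;
  esac
}

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: only show the request that would be sent.
  echo "curl -s -o /dev/null -w '%{http_code}' -I $ES/$INDEX"
else
  # -I sends a HEAD request; -w prints only the HTTP status code.
  status=$(curl -s -o /dev/null -w '%{http_code}' -I "$ES/$INDEX")
  explain_status "$status"
fi
```

Run with `DRY_RUN=0` against a reachable cluster to perform the actual check.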