Elasticsearch - Tokenizer configuration

Time: 2015-07-01 16:47:32

Tags: regex elasticsearch

Does anyone know which tokenizer to use, and how to configure it, to get the following behavior?

Input: ["test1-data.example.com", "test2-new.example.com", "new1-test.example.com"]

Output (expected):
   test1-data.example.com test2-new.example.com new1-test.example.com

1 answer:

Answer 0: (score: 0)

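It is not clear whether this will solve your problem, but here is one way to get the output you asked for, using a stored, not_analyzed _all field: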

DELETE /test_index

PUT /test_index
{
   "settings": {
      "number_of_shards": 1
   },
   "mappings": {
      "doc": {
         "_all": {
            "enabled": true,
            "store": true,
            "index": "not_analyzed"
         },
         "properties": {
            "text_field": {
               "type": "string",
               "include_in_all": true
            }
         }
      }
   }
}

PUT /test_index/doc/1
{
    "text_field": ["test1-data.example.com", "test2-new.example.com", "new1-test.example.com"]
}

POST /test_index/_search
{
    "fields": [
       "_all"
    ]
}
...
{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "1",
            "_score": 1,
            "fields": {
               "_all": "test1-data.example.com test2-new.example.com new1-test.example.com "
            }
         }
      ]
   }
}

Here is the code in Sense:

http://sense.qbox.io/gist/45200711a41268634439b669e18541e68042ac8a
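
As a rough illustration of what the response above shows, the Python sketch below mimics the effect rather than Elasticsearch itself: the array values end up in one space-separated string, and splitting that string on whitespace gives the hostnames back intact, hyphens and dots included. The helper names build_all_field and whitespace_tokens are hypothetical and are not part of any Elasticsearch client. The sketch also leaves out the trailing space that appears in the stored _all value.

```python
# Illustrative only: plain-Python stand-ins, not Elasticsearch API calls.

def build_all_field(values):
    """Join the field values with single spaces, similar to the "_all"
    string in the response above (without its trailing space)."""
    return " ".join(values)


def whitespace_tokens(text):
    """Split on runs of whitespace, which is roughly what a
    whitespace-based tokenizer would emit for this string."""
    return text.split()


hosts = [
    "test1-data.example.com",
    "test2-new.example.com",
    "new1-test.example.com",
]

all_value = build_all_field(hosts)
print(all_value)                     # one space-separated string
print(whitespace_tokens(all_value))  # the three hostnames, unchanged
```

If the goal is one token per hostname rather than a single joined string, splitting on whitespace alone is enough here, because none of the hostnames contains a space.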