Elasticsearch shingle token filter not working

Time: 2016-07-20 14:27:54

Tags: elasticsearch shingles

I am trying this on a local Elasticsearch 1.7.5 installation:

http://localhost:9200/_analyze?filter=shingle&tokenizer=keyword&text=alkis stack

and this is what I get back:

{
   "tokens":[
      {
         "token":"alkis stack",
         "start_offset":0,
         "end_offset":11,
         "type":"word",
         "position":1
      }
   ]
}

I expected to see something like this:

{
   "tokens":[
      {
         "token":"alkis stack",
         "start_offset":0,
         "end_offset":11,
         "type":"word",
         "position":1
      },
      {
         "token":"stack alkis",
         "start_offset":0,
         "end_offset":11,
         "type":"word",
         "position":1
      }
   ]
}

Am I missing something?

Update

These are my index settings:

{
  "number_of_shards": 2,
  "number_of_replicas": 0,
  "analysis": {
    "char_filter": {
      "map_special_chars": {
        "type": "mapping",
        "mappings": [
          "- => \\u0020",
          ". => \\u0020",
          "? => \\u0020",
          ", => \\u0020",
          "` => \\u0020",
          "' => \\u0020",
          "\" => \\u0020"
        ]
      }
    },
    "filter": {
      "permutate_fullname": {
        "type": "shingle",
        "max_shingle_size": 4,
        "min_shingle_size": 2,
        "output_unigrams": true,
        "token_separator": " ",
        "filler_token": "_"
      }
    },
    "analyzer": {
      "fullname_analyzer_search": {
        "char_filter": [
          "map_special_chars"
        ],
        "filter": [
          "asciifolding",
          "lowercase",
          "trim"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      },
      "fullname_analyzer_index": {
        "char_filter": [
          "map_special_chars"
        ],
        "filter": [
          "asciifolding",
          "lowercase",
          "trim",
          "permutate_fullname"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      }
    }
  }
}

I tried to test the index analyzer like this:

http://localhost:9200/INDEX_NAME/_analyze?analyzer=fullname_analyzer_index&text=alkis stack
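
For reference, the output above is consistent with how the two components are documented to behave: the keyword tokenizer emits the whole input as a single token, and a shingle filter only joins runs of adjacent tokens in their original order. The sketch below is a rough pure-Python approximation of that behaviour, not Elasticsearch or Lucene code; the function name `shingles` and its parameters are illustrative and only loosely mirror the filter's `min_shingle_size`, `max_shingle_size`, `output_unigrams` and `token_separator` settings.

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True, separator=" "):
    """Approximate a shingle filter: join runs of adjacent tokens, keeping their order."""
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            # A shingle needs `size` consecutive tokens starting at `start`.
            if start + size <= len(tokens):
                out.append(separator.join(tokens[start:start + size]))
    return out


# Keyword-style tokenization: the whole input is one token, so nothing can be joined.
keyword_tokens = ["alkis stack"]

# Whitespace-style tokenization: two tokens, so one two-word shingle is possible.
split_tokens = "alkis stack".split()

print(shingles(keyword_tokens))  # ['alkis stack']
print(shingles(split_tokens))    # ['alkis', 'alkis stack', 'stack']
```

In neither case does "stack alkis" appear, because shingles are word n-grams rather than permutations: the tokens are never reordered.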

1 answer:

Answer 0 (score: 1):

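Index the first name and the last name in two separate fields in ES, just as you have them in the database. The text received as a query can be analyzed at search time (match does this, for example, and so does query_string), and there are ways to search both fields at once with all the terms in the search string. I think you are overcomplicating the use case by keeping the whole name in a single field and creating name permutations at index time.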
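
One possible way to search both fields at once is a `multi_match` query of type `cross_fields` with `operator` set to `and`. This is only a sketch of that one option, not necessarily what the answer had in mind. The field names `first_name` and `last_name` and the helper `fullname_query` are hypothetical; the code only builds the request body and does not contact a cluster.

```python
import json


def fullname_query(text, fields=("first_name", "last_name")):
    """Build a search body that requires every term of `text` to match in at least one field.

    The field names are placeholders; replace them with the fields in your own mapping.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": "cross_fields",  # treat the listed fields as one combined field
                "fields": list(fields),
                "operator": "and",       # every term must be found in some field
            }
        }
    }


# The same body shape is produced whichever order the user types the name in.
print(json.dumps(fullname_query("stack alkis"), indent=2))
```

Because the query text is analyzed at search time and each term may match in either field, "alkis stack" and "stack alkis" should find the same document without any permutations being stored in the index.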