Question

Good day, Elasticsearch gurus!

I am running into problems with the word_delimiter token filter, specifically its "generate_word_parts" and "split_on_case_change" options. I have no problem with split_on_numerics, though: setting it to true or false gives me the expected output.

Below are my settings, mapping, and a sample object:

Settings:

{
    "settings": {
        "analysis": {
            "analyzer": {
                "index_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "porter_stem", "my_delimiter"]
                },
                "search_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "porter_stem", "my_delimiter"]
                }
            },
            "filter": {
                "my_delimiter": {
                    "type": "word_delimiter",
                    "generate_word_parts": true,
                    "generate_number_parts" : true,
                    "split_on_case_change": true,
                    "split_on_numerics": true
                }
            }
        }
    }
}

Mapping:

{
    "picture": {
      "_all" : {"enabled" : true, "index_analyzer": "index_analyzer", "search_analyzer": "search_analyzer"},
      "properties": {
        "id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "title": {
          "type": "string",
          "boost": 7.0,
          "index": "analyzed",
          "index_analyzer": "index_analyzer",
          "search_analyzer": "search_analyzer",
          "store": "yes"
        },
        "description": {
          "type": "string",
          "boost": 1.0,
          "index": "analyzed",
          "index_analyzer": "index_analyzer",
          "search_analyzer": "search_analyzer"
        },
        "featured": {
          "type": "boolean",
          "index": "not_analyzed"
        },
        "categories": {
          "type": "string",
          "boost": 2.0,
          "index_name": "category",
          "index": "analyzed",
          "index_analyzer": "index_analyzer",
          "search_analyzer": "search_analyzer",
          "store": "yes"
        },
        "tags": {
          "type": "string",
          "boost": 4.0,
          "index_name": "tag",
          "index": "analyzed",
          "index_analyzer": "index_analyzer",
          "search_analyzer": "search_analyzer",
          "store": "yes"
        },
        "created_at": {
          "type": "double",
          "index": "not_analyzed"
        }
      }
    }
}

Data:

{
   "id":"4defe0ecf10a8524b8000047",
   "title":"Don Henrico's Road2Mountain",
   "description":"",
   "featured":false,
   "categories":[
      "landscape",
      "nature",
      "traffic"
   ],
   "tags":[
      "everywhere",
      "AmazingSights"
   ],
   "created_at":1307564220.04741
}

I then ran the following searches:

1.) http://localhost:9200/pictures/picture/_search?q=henricos (OK - 1 match)
2.) http://localhost:9200/pictures/picture/_search?q=road (OK - 1 match)
3.) http://localhost:9200/pictures/picture/_search?q=mountain (OK - 1 match)
4.) http://localhost:9200/pictures/picture/_search?q=everywhere (OK - 1 match)
5.) http://localhost:9200/pictures/picture/_search?q=amazing (NOT OK - no match)
6.) http://localhost:9200/pictures/picture/_search?q=sights (NOT OK - no match)

This suggests that my settings ["generate_word_parts": true] and ["split_on_case_change": true] are not working (as shown in 5 and 6).
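
One detail that may be relevant (an observation about the configuration, not a confirmed diagnosis): both analyzers list "lowercase" before "my_delimiter", and token filters run in the order listed. The sketch below uses simplified Python stand-ins for the filters to show what that ordering does to the tag "AmazingSights". The helpers `lowercase`, `split_on_case_change`, `split_on_numerics`, and `run_chain` are hypothetical illustrations, not Elasticsearch APIs, and porter_stem is left out.

```python
import re


def lowercase(tokens):
    # Stand-in for the "lowercase" token filter.
    return [t.lower() for t in tokens]


def split_on_case_change(tokens):
    # Simplified stand-in: split where a lowercase letter is
    # immediately followed by an uppercase letter.
    out = []
    for t in tokens:
        out.extend(re.sub(r"([a-z])([A-Z])", r"\1 \2", t).split())
    return out


def split_on_numerics(tokens):
    # Simplified stand-in: split at letter/digit boundaries.
    out = []
    for t in tokens:
        spaced = re.sub(r"([A-Za-z])([0-9])", r"\1 \2", t)
        spaced = re.sub(r"([0-9])([A-Za-z])", r"\1 \2", spaced)
        out.extend(spaced.split())
    return out


def run_chain(tokens, filters):
    # Apply each filter in the order given, as an analyzer's
    # "filter" array does.
    for f in filters:
        tokens = f(tokens)
    return tokens


# Order used in the settings above: lowercase first, delimiter second.
as_configured = run_chain(["AmazingSights"], [lowercase, split_on_case_change])

# Alternative order (an assumption to test, not the original config).
delimiter_first = run_chain(["AmazingSights"], [split_on_case_change, lowercase])

# Numeric boundaries survive lowercasing, so this split still happens.
numerics = run_chain(["Road2Mountain"], [lowercase, split_on_numerics])

print(as_configured)    # ['amazingsights']
print(delimiter_first)  # ['amazing', 'sights']
print(numerics)         # ['road', '2', 'mountain']
```

In this simplified model, lowercasing first removes the case boundary before the delimiter can see it, while the digit boundary in "Road2Mountain" is unaffected, which would be consistent with searches 2 and 3 matching and 5 and 6 not matching. If that is the cause, moving "my_delimiter" ahead of "lowercase" in the filter array would be the change to try; this has not been verified against the index above.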

So my main question is: why are these features not working?
Are there more tweaks I need to make for them to work?
Or is this another concept that I have misunderstood?

Your insight and suggestions would be greatly appreciated!

Many thanks =)!

P.S. References:

Elasticsearch token filter, word_delimiter and generate_word_parts

0 answers: