word_delimiter on a block of text

Time: 2016-05-10 20:11:14

Tags: elasticsearch

It seems that word_delimiter only works on single terms. What if I have a block of text like the one below?

 "Contra-indications of paracetamol can be of certain sorts"

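In this case, word_delimiter takes the whole sentence and concatenates it, whereas I need it to concatenate only "Contra-indications", so that I can search for "contra indications", "contra-indications", or "contraindications" within a passage of text.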

1 answer:

Answer 0: (score: 1)

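You need an analyzer like this one. The whitespace tokenizer splits the text into words first, so the word_delimiter filter sees each word as a separate token and concatenates only the parts of that word: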

{
  "settings": {
    "analysis": {
      "filter": {
        "delimiter_filter": {
          "type": "word_delimiter",
          "catenate_words": true,
          "preserve_original": true
        }
      },
      "analyzer": {
        "delimiter_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "delimiter_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "assets": {
      "properties": {
        "domain": {
          "type": "string",
          "analyzer": "delimiter_analyzer"
        }
      }
    }
  }
}

For your example text, "Contra-indications of paracetamol can be of certain sorts", these are the terms it produces:

           "domain": [
              "be",
              "can",
              "certain",
              "contra",
              "contra-indications",
              "contraindications",
              "indications",
              "of",
              "paracetamol",
              "sorts"
           ]
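
To make the behaviour above easier to reason about, here is a rough Python emulation of that analysis chain (whitespace tokenizer, then lowercase, then a word_delimiter-like split with preserve_original and catenate_words). It is only a sketch, not Elasticsearch's implementation: the function name delimiter_analyzer is made up for illustration, splitting is done on any non-alphanumeric run, and the real filter's other rules (case changes, letter/digit boundaries, possessives) are ignored.

```python
import re


def delimiter_analyzer(text):
    """Simplified stand-in for the custom analyzer; NOT the real Elasticsearch code."""
    terms = set()
    for token in text.split():              # "whitespace" tokenizer: one token per word
        token = token.lower()               # "lowercase" filter
        terms.add(token)                    # preserve_original: true
        # Assumption: treat every non-alphanumeric run as a delimiter.
        parts = [p for p in re.split(r"[^a-z0-9]+", token) if p]
        terms.update(parts)                 # word parts (emitted by default)
        if len(parts) > 1:
            terms.add("".join(parts))       # catenate_words: true
    return sorted(terms)


terms = delimiter_analyzer("Contra-indications of paracetamol can be of certain sorts")
print(terms)

# Each spelling from the question shares at least one term with the indexed text.
queries = ["contra indications", "contra-indications", "contraindications"]
matches = {q: sorted(set(delimiter_analyzer(q)) & set(terms)) for q in queries}
print(matches)
```

Because the filter runs per token, only "contra-indications" is split and concatenated; the rest of the sentence passes through unchanged. On a real cluster, the _analyze API with the delimiter_analyzer defined above is the authoritative way to check the emitted terms.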