Question

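I have an Elasticsearch field that I index with an ngram tokenizer. Unexpectedly, Elasticsearch does not merge adjacent highlights. For example, for the search term 854511 I get the following highlight: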

DA V50 v335 auf v331 J06A <mark>85</mark><mark>45</mark><mark>11</mark>

whereas I expected this:

DA V50 v335 auf v331 J06A <mark>854511</mark>

Here are my analyzers:

ADDITIONAL_ANALYZERS = {
  analyzer: {
    ngram_analyzer: {
      tokenizer: :ngram_tokenizer,
      filter: 'lowercase'
    }
  },
  tokenizer: {
    ngram_tokenizer: {
      type: :nGram,
      min_gram: 2,
      max_gram: 20,
      token_chars: ['letter', 'digit', 'symbol', 'punctuation']
    }
  }
}
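
To see which terms a tokenizer configured like this produces for the search term, here is a minimal Ruby sketch. The `ngrams` helper is hypothetical and does not reproduce Elasticsearch's internals. It only lists every substring between `min_gram` and `max_gram` characters long, which is roughly what an ngram tokenizer emits for one run of token characters.

```ruby
# Hypothetical helper (not part of Elasticsearch or any gem): list every
# substring of `text` whose length is between min_gram and max_gram.
def ngrams(text, min_gram: 2, max_gram: 20)
  grams = []
  (0...text.length).each do |start|
    (min_gram..max_gram).each do |len|
      # Stop once the gram would run past the end of the text.
      break if start + len > text.length
      grams << text[start, len]
    end
  end
  grams
end

tokens = ngrams('854511')
puts tokens.length          # => 15
puts tokens.first(5).inspect
```

The 15 grams include "85", "45", and "11" as well as the full "854511", so a match query on this field compares many overlapping terms. That may be why the highlighter in the question marks several short pieces rather than one span.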

settings analysis: ADDITIONAL_ANALYZERS do
  mappings do
    indexes :name, type: 'multi_field' do
      indexes :name, type: :string, analyzer: :ngram_analyzer, term_vector: :with_positions_offsets
      indexes :not_analyzed, type: :string, index: :not_analyzed
    end
    indexes :mdc, type: :string, index: :not_analyzed
    indexes :description, type: :string, analyzer: :html_ngram_analyzer, term_vector: :with_positions_offsets
    indexes :created_at, type: :date
  end
end

Answer 1

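Try using the plain highlighter. The fields here are mapped with term_vector: with_positions_offsets, which in Elasticsearch versions of this era most likely selects the fast vector highlighter by default; setting "type": "plain" overrides that choice.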

If you try the following query:

{
  "query": {
    "match": {
      "name": "854511"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "pre_tags": [
          "<mark>"
        ],
        "post_tags": [
          "</mark>"
        ],
        "fragment_size": 150,
        "number_of_fragments": 1,
        "type": "plain"
      }
    }
  }
}

you get the desired result:

"hits": [
  {
    "_index": "test",
    "_type": "test",
    "_id": "1",
    "_score": 0.1856931,
    "_source": {
      "name": "DA V50 v335 auf v331 J06A 854511"
    },
    "highlight": {
      "name": [
        "DA V50 v335 auf v331 J06A <mark>854511</mark>"
      ]
    }
  }
]
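
Since the question's code is Ruby, the same request body can be built as a Ruby hash and serialized to JSON. This is only a sketch: `plain_highlight_query` is a hypothetical helper, and how the body is sent (which client gem, which index name) is left out because the question does not say.

```ruby
require 'json'

# Hypothetical helper: build the search body from the answer for a given
# field and search term, forcing the plain highlighter.
def plain_highlight_query(field, term)
  {
    query: { match: { field => term } },
    highlight: {
      fields: {
        field => {
          pre_tags: ['<mark>'],
          post_tags: ['</mark>'],
          fragment_size: 150,
          number_of_fragments: 1,
          type: 'plain' # the setting the answer relies on
        }
      }
    }
  }
end

body = plain_highlight_query('name', '854511')
puts JSON.pretty_generate(body)
```

Taking the field name as a parameter means the same helper could be applied to the description field from the mapping, assuming the plain highlighter behaves the same way there.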
